Abstract

This paper analyzes the performance of Spearman's rho (SR) and Kendall's tau (KT) on samples drawn from bivariate normal and bivariate contaminated normal populations. Exact analytical formulae for the variance of SR and for the covariance between SR and KT are obtained from Childs's reduction formula for the quadrivariate normal positive orthant probabilities. Closed-form expressions for the expectations of SR and KT are established under the bivariate contaminated normal model. The bias, mean square error (MSE) and asymptotic relative efficiency (ARE) of the three estimators based on SR and KT, relative to Pearson's product moment correlation coefficient (PPMCC), are investigated in both the normal and contaminated normal models. Theoretical and simulation results suggest that, contrary to the opinion in some of the literature that SR and KT are equivalent, the behaviors of SR and KT differ strikingly in terms of bias, variance, mean square error, and asymptotic relative efficiency. These findings provide not only deeper insight into the two most widely used rank-based correlation coefficients, but also guidance for choosing between them in circumstances where the PPMCC fails to apply.

I. Introduction

Correlation analysis is among the core research paradigms in nearly all branches of science and engineering, not to mention the area of information theory [1]-[10]. Interpreted as the strength of the statistical relationship between two random variables [11], correlation should be large and positive if there is a high probability that large (small) values of one variable occur in conjunction with large (small) values of the other; and it should be large and negative if the direction is reversed [12]. A number of methods have been proposed and applied in
the literature to assess the correlation between two random variables. Among these methods the Pearson's product 
moment correlation coefficient (PPMCC) [BJ, OH, Spearman's rho (SR) [BJ and Kendall's tau (KT) (BJ are 
perhaps the most widely used [16].

The properties of PPMCC in bivariate normal samples (the binormal model) are well known thanks to the creative work of Fisher [13]. It follows that, in the normal case, 1) PPMCC is an asymptotically unbiased estimator of the population correlation $\rho$, and 2) the variance of PPMCC approaches the Cramer-Rao lower bound (CRLB) as the sample size increases [11]. Owing to this optimality, PPMCC has played, and will continue to play, the dominant role in quantifying the strength of correlation between bivariate random variables. However, the PPMCC may not be applicable in the following scenarios:

1) The data is incomplete, that is, only ordinal information (e.g. ranks) is available. This situation is not uncommon in the social sciences, such as psychology and education [15];

2) The underlying data is complete (cardinal) and follows a bivariate normal distribution, but is attenuated to some degree by a monotone nonlinearity in the transfer characteristics of the sensors [17];

3) The data is complete and the majority follows a bivariate normal distribution, but there exists a tiny fraction of outliers (impulsive noise) [18]-[20].

Under these circumstances, it is more suitable to employ the two most popular nonparametric coefficients, SR and KT, which are 1) dependent only on ranks, 2) invariant under increasing monotone transformations ifBl , and 3) robust against impulsive noise [21]. This raises the question: which one, SR or KT, should we use in Scenarios 1) to 3), where the familiar PPMCC is inapplicable? Unfortunately, despite the long history of SR and KT, the answers to this question in the literature are still inconsistent. Some researchers, such as Fieller et al. [22], preferred KT to SR based on empirical evidence, while others, such as Gilpin [23], asserted that SR and KT are equivalent.

To resolve this inconsistency, in this work we systematically investigate the properties of SR and KT under the binormal model [24]-[26]. Moreover, to deal with Scenario 3) mentioned above, we also investigate their properties under the contaminated binormal model [18]-[20]. Our theoretical contribution is multifold. Firstly, we find a computationally more tractable formula for the variance of SR. Based on this formula, we provide the densely tabulated Table I with high precision (ten decimal places). This table overcomes the shortcomings of the existing power-series-based approximations, which are tedious to use and of rather limited precision (up to five decimal places, and for $\rho < 0.8$ only) [22], [27]-[29]. Secondly, we derive the exact analytical expression of the covariance between SR and KT. With this new analytical result, we uncover a minor error in the literature [15], [28]. Thirdly, we obtain the asymptotic expressions of the variances, and hence the asymptotic relative efficiencies (AREs), of the three estimators of the population correlation $\rho$ based on SR and KT. Finally, we find the asymptotic expressions for the expectations of SR and KT under the contaminated normal model.

The rest of this paper is structured as follows. Section II gives some basic definitions and summarizes the general properties of PPMCC, SR and KT. In Section III we lay the foundation of the theoretical framework of this study by outlining some critical results in the binormal model. Section IV establishes 1) the exact expression of the variance of SR and 2) two exact expressions for the covariance between SR and KT in the bivariate normal model, and 3) the closed-form formulae for the expectations of SR and KT in the contaminated normal model. In Section V we focus on the performance of the three estimators of $\rho$ constructed from SR and KT. Section VI verifies the analytical results with Monte Carlo simulations. Finally, in Section VII we provide our answers to the question raised above concerning the choice between Spearman's rho and Kendall's tau in practice when PPMCC fails to apply.

II. Basic Definitions and General Properties 

A. Definitions 

Let $\{(X_i, Y_i)\}_{i=1}^{n}$ denote n independent and identically distributed (i.i.d.) data pairs drawn from a bivariate population with continuous joint distribution. Suppose that $X_j$ is at the kth position in the sorted sequence $X_{(1)} < \cdots < X_{(n)}$. The number k is termed the rank of $X_j$ and is denoted by $P_j$. Similarly we can get the rank of $Y_j$, which is denoted by $Q_j$ [11]. Let $\bar{X}$ and $\bar{Y}$ be the arithmetic means of $X_i$ and $Y_i$, respectively. Let $\mathrm{sgn}(\cdot)$ stand for the sign of its argument. The three well-known classical correlation coefficients, PPMCC ($r_P$), SR ($r_S$), and KT ($r_K$), are then defined as follows [12]:

$$r_P(X, Y) \triangleq \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\left[\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2\right]^{1/2}} \tag{1}$$

$$r_S(X, Y) \triangleq 1 - \frac{6\sum_{i=1}^{n} (P_i - Q_i)^2}{n(n^2 - 1)} \tag{2}$$

$$r_K(X, Y) \triangleq \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} \mathrm{sgn}(X_i - X_j)\,\mathrm{sgn}(Y_i - Y_j)}{n(n - 1)}. \tag{3}$$

To ease the following discussion, we will employ the symbol $r_\lambda(X, Y)$, $\lambda \in \{P, S, K\}$, as a compact notation for the three coefficients. For brevity, the arguments of $r_\lambda(X, Y)$ will be dropped in the sequel unless ambiguity occurs.
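As an illustration of definitions (1)-(3), the following minimal Python sketch computes the three coefficients directly from their formulas. The function names (`ranks`, `pearson`, `spearman`, `kendall`) and the sample data are hypothetical and chosen only for this example; the sketch assumes tie-free data, in line with the continuity assumption above.

```python
import math

def ranks(values):
    """Return the rank (1 = smallest) of each entry; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def sgn(v):
    """Sign function: -1, 0 or +1."""
    return (v > 0) - (v < 0)

def pearson(x, y):
    """PPMCC r_P as in (1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """SR r_S as in (2), from the squared rank differences."""
    n = len(x)
    p, q = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return 1 - 6 * d2 / (n * (n * n - 1))

def kendall(x, y):
    """KT r_K as in (3), using the full double sum over i and j."""
    n = len(x)
    s = sum(sgn(x[i] - x[j]) * sgn(y[i] - y[j])
            for i in range(n) for j in range(n))
    return s / (n * (n - 1))

# Illustrative data (not from the paper)
x = [0.3, -1.2, 2.5, 0.9, 1.7]
y = [math.exp(v) for v in x]        # monotone increasing transform of x
z = [2.0, 1.0, 3.0, 5.0, 4.0]

print(pearson(x, y), spearman(x, y), kendall(x, y))
print(pearson(x, z), spearman(x, z), kendall(x, z))
```

With the monotone transform `y = exp(x)`, the two rank-based coefficients equal one while the PPMCC stays below one, which is the behavior that motivates using SR and KT in Scenario 2) of the Introduction.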

B. General Properties 

It follows that the coefficients $r_\lambda$, $\lambda \in \{P, S, K\}$, possess the following general properties:
1) $r_\lambda(X, Y) \in [-1, 1]$ for all (X, Y) (standardization);
2) $r_\lambda(X, Y) = r_\lambda(Y, X)$ (symmetry);

3) $r_\lambda = \pm 1$ if Y is a positive (negative) linear transformation of X (shift and scale invariance);

4) $r_S = r_K = \pm 1$ if Y is a monotone increasing (decreasing) function of X (monotone invariance);

5) The expectations of $r_\lambda$ equal zero if X and Y are independent (independence);

6) $r_\lambda(+, +) = -r_\lambda(-, +) = -r_\lambda(+, -) = r_\lambda(-, -)$;

7) $r_\lambda$ converges to a normal distribution when the sample size n is large.

Note that the first six properties are discussed in [12] and [16], and the last property follows from the asymptotic theory of U-statistics established by Hoeffding [30].

C. Relationships Among PPMCC, SR and KT 

From their expressions (1)-(3), it appears that the three coefficients PPMCC, SR and KT are quite different. However, as demonstrated below, these three coefficients are closely related to each other.

1) Daniels' Generalized Correlation Coefficient: Consider the n data pairs $(X_i, Y_i)$, i = 1, ..., n, at hand. To each pair of X's, $(X_i, X_j)$, we can allot a score $a_{ij}$ such that $a_{ij} = -a_{ji}$ and $a_{ii} = 0$. In a similar manner, we can also allot a score $b_{ij}$ to the ordered pair of Y's, $(Y_i, Y_j)$. Daniels' generalized coefficient $\Gamma$ is then defined by [31]

$$\Gamma \triangleq \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} b_{ij}}{\left(\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}^2 \, \sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}^2\right)^{1/2}}. \tag{4}$$

This general setup covers PPMCC, SR and KT as special cases with respect to different systems of scores [31]:
• Replacing $a_{ij}$ by $X_j - X_i$ and $b_{ij}$ by $Y_j - Y_i$ in (4) gives the PPMCC $r_P$ defined in (1);
• Replacing $a_{ij}$ by $P_j - P_i$ and $b_{ij}$ by $Q_j - Q_i$ in (4) gives the SR $r_S$ defined in (2);
• Replacing $a_{ij}$ by $\mathrm{sgn}(X_j - X_i)$ and $b_{ij}$ by $\mathrm{sgn}(Y_j - Y_i)$ in (4) gives the KT $r_K$ defined in (3).

2) Inequalities between SR and KT: It is possible to state certain inequalities connecting the values of SR and KT based on a given set of n observations. The first one, ascribed to Daniels [32], is

3(n + 2) 2(n+l) 
- 1 < 7T- rK ^-rs < 1 (5) 



which, for large n, becomes

$$-1 \le 3r_K - 2r_S \le 1.$$

The second one, due to Durbin and Stuart [33], states that

$$r_S \le 1 - \frac{1 - r_K}{2(n + 1)}\left[(n - 1)(1 - r_K) + 4\right]. \tag{6}$$

Combining (5) and (6) and letting $n \to \infty$ yields the bounds of SR, in terms of KT, as

$$\frac{3}{2} r_K - \frac{1}{2} \le r_S \le \frac{1}{2} + r_K - \frac{1}{2} r_K^2, \quad r_K \ge 0$$

$$\frac{3}{2} r_K + \frac{1}{2} \ge r_S \ge \frac{1}{2} r_K^2 + r_K - \frac{1}{2}, \quad r_K \le 0.$$
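The Durbin and Stuart bound (6) can be checked by brute force for small n. The sketch below is only an illustration under stated assumptions: it rewrites (6) in terms of $D = \sum_i (P_i - Q_i)^2$ and the number of discordant pairs $Q$, using $r_S = 1 - 6D/[n(n^2-1)]$ and $r_K = 1 - 4Q/[n(n-1)]$, so that (6) becomes the integer inequality $3nD \ge 4Q(n + Q)$. The function name `check_durbin_stuart` is hypothetical.

```python
from itertools import permutations

def check_durbin_stuart(n):
    """Return True if inequality (6) holds for every permutation of n ranks."""
    for q in permutations(range(1, n + 1)):
        # D: sum of squared rank differences against the identity ranking
        d = sum((i + 1 - q[i]) ** 2 for i in range(n))
        # Q: number of discordant pairs (inversions) in the permutation
        disc = sum(1 for i in range(n) for j in range(i + 1, n) if q[i] > q[j])
        # (6) is equivalent to 3 n D >= 4 Q (n + Q); exact integer arithmetic
        # avoids floating-point trouble in the cases where equality holds.
        if 3 * n * d < 4 * disc * (n + disc):
            return False
    return True

for n in range(3, 7):
    print(n, check_durbin_stuart(n))
```

Equality is attained, for example, by the fully reversed ranking, so working with integers rather than with the floating-point values of $r_S$ and $r_K$ is a deliberate choice here.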



3) Relationship of SR to Other Coefficients: Besides the PPMCC and KT, SR is also closely related to other correlation coefficients, e.g., the order statistics correlation coefficient (OSCC) [34]-[36] and the Gini correlation (GC) [37]. In fact, SR can be obtained from the OSCC and GC by replacing the variates with their corresponding ranks [38].

III. Auxiliary Results in Normal Cases 

In this section we provide some prerequisites concerning the orthant probabilities of normal distributions. These probabilities, contained in Lemma 1, are critical for the development of Theorem 1 and Theorem 2 later on. Moreover, some well-known results about the expectation and variance of PPMCC, SR and KT are collected in Lemma 2 for ease of exposition. For convenience, we use the symbols $E(\cdot)$, $V(\cdot)$, $C(\cdot,\cdot)$, and $\mathrm{corr}(\cdot,\cdot)$ in the sequel to denote the mean, variance, covariance, and correlation of (between) random variables, respectively. The big-oh and little-oh symbols are used to compare the magnitudes of two functions $u(\lambda)$ and $v(\lambda)$ as the argument $\lambda$ tends to a limit L (which might be infinite). The notation $u(\lambda) = O(v(\lambda))$, $\lambda \to L$, denotes that $|u(\lambda)/v(\lambda)|$ remains bounded as $\lambda \to L$, whereas the notation $u(\lambda) = o(v(\lambda))$, $\lambda \to L$, denotes that $u(\lambda)/v(\lambda) \to 0$ as $\lambda \to L$ [39]. The symbols $P_m^0(Z_1, \ldots, Z_m)$ are adopted to denote the positive orthant probabilities associated with multivariate normal random vectors $[Z_1 \cdots Z_m]$ of dimensions m = 1, ..., 4, respectively. The notation $R = (\varrho_{rs})_{m \times m}$ stands for the correlation matrix with elements $\varrho_{rs} = \mathrm{corr}(Z_r, Z_s)$, r, s = 1, ..., m. Obviously the diagonal entries of R are all unity. For compactness, we will also use the symbol $P_m^0(R)$ to denote $P_m^0(Z_1, \ldots, Z_m)$ in the sequel.

A. Orthant Probabilities for Normal Distributions 

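Lemma 1: Assume that $Z_1, Z_2, Z_3, Z_4$ follow a quadrivariate normal distribution with zero means and correlation matrix $R = (\varrho_{rs})_{4 \times 4}$. Define

$$H(\lambda) \triangleq \begin{cases} 1 & (\lambda > 0) \\ 0 & (\lambda \le 0). \end{cases} \tag{7}$$

Then

$$P_1^0(Z_1) \triangleq E\{H(Z_1)\} = \frac{1}{2} \tag{8}$$

$$P_2^0(Z_1, Z_2) \triangleq E\{H(Z_1)H(Z_2)\} = \frac{1}{4}\left(1 + \frac{2}{\pi}\sin^{-1}\varrho_{12}\right) \tag{9}$$

$$P_3^0(Z_1, Z_2, Z_3) \triangleq E\{H(Z_1)H(Z_2)H(Z_3)\} = \frac{1}{8}\left(1 + \frac{2}{\pi}\sum_{r=1}^{2}\sum_{s=r+1}^{3}\sin^{-1}\varrho_{rs}\right) \tag{10}$$

$$P_4^0(Z_1, Z_2, Z_3, Z_4) \triangleq E\{H(Z_1)H(Z_2)H(Z_3)H(Z_4)\} = \frac{1}{16}\left(1 + \frac{2}{\pi}\sum_{r=1}^{3}\sum_{s=r+1}^{4}\sin^{-1}\varrho_{rs} + W\right) \tag{11}$$

where

$$W \triangleq \frac{1}{\pi^4}\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \frac{\exp\left(-\frac{1}{2} z R z^T\right)}{z_1 z_2 z_3 z_4}\, dz_1\, dz_2\, dz_3\, dz_4 \tag{12}$$

$$\phantom{W} = \sum_{\ell=2}^{4} \frac{4}{\pi^2}\int_0^1 \frac{\varrho_{1\ell}}{\left[1 - \varrho_{1\ell}^2 u^2\right]^{1/2}} \sin^{-1}\left[\frac{\alpha_\ell(u)}{\beta_\ell(u)\gamma_\ell(u)}\right] du \tag{13}$$

with

$$\alpha_2 = \varrho_{34} - \varrho_{23}\varrho_{24} - \left[\varrho_{13}\varrho_{14} + \varrho_{12}(\varrho_{12}\varrho_{34} - \varrho_{14}\varrho_{23} - \varrho_{13}\varrho_{24})\right]u^2$$

$$\alpha_3 = \varrho_{24} - \varrho_{23}\varrho_{34} - \left[\varrho_{12}\varrho_{14} + \varrho_{13}(\varrho_{13}\varrho_{24} - \varrho_{14}\varrho_{23} - \varrho_{12}\varrho_{34})\right]u^2$$

$$\alpha_4 = \varrho_{23} - \varrho_{24}\varrho_{34} - \left[\varrho_{12}\varrho_{13} + \varrho_{14}(\varrho_{14}\varrho_{23} - \varrho_{13}\varrho_{24} - \varrho_{12}\varrho_{34})\right]u^2$$

$$\beta_2 = \beta_3 = \left[1 - \varrho_{23}^2 - (\varrho_{12}^2 + \varrho_{13}^2 - 2\varrho_{12}\varrho_{13}\varrho_{23})u^2\right]^{1/2}$$

$$\gamma_2 = \beta_4 = \left[1 - \varrho_{24}^2 - (\varrho_{12}^2 + \varrho_{14}^2 - 2\varrho_{12}\varrho_{14}\varrho_{24})u^2\right]^{1/2}$$

$$\gamma_3 = \gamma_4 = \left[1 - \varrho_{34}^2 - (\varrho_{13}^2 + \varrho_{14}^2 - 2\varrho_{13}\varrho_{14}\varrho_{34})u^2\right]^{1/2}.$$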

Proof: The first statement (8) is trivial. The second one (9) is usually called Sheppard's theorem in the literature, although it was proposed earlier by Stieltjes [40]. The third one (10) is a simple generalization of Sheppard's theorem based on the relationship [41]

$$P_3^0 = \frac{1}{2}\left[1 - \sum_{r=1}^{3} P_1^0(Z_r) + \sum_{r=1}^{2}\sum_{s=r+1}^{3} P_2^0(Z_r, Z_s)\right].$$

The last one (11) is due to Childs [42] and is termed Childs's reduction formula throughout. ■
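To make the reduction formula concrete, the following Python sketch evaluates W in (13) by composite Simpson integration and then the quadrivariate orthant probability through (11). It is a minimal illustration rather than the authors' implementation: the names `childs_w` and `orthant_p4`, the step count, and the use of Simpson's rule are assumptions of this example, and the code assumes a nonsingular correlation matrix so that no denominator vanishes on [0, 1].

```python
import math

def childs_w(R, steps=2000):
    """Approximate W in (13) for a 4x4 correlation matrix R (list of lists)."""
    def q(r, s):
        return R[r - 1][s - 1]          # 1-based access to the correlations

    def integrand(l, u):
        a, b = [k for k in (2, 3, 4) if k != l]   # the two remaining indices
        u2 = u * u
        # alpha_l(u) as listed after (13)
        alpha = (q(a, b) - q(l, a) * q(l, b)
                 - (q(1, a) * q(1, b)
                    + q(1, l) * (q(1, l) * q(a, b)
                                 - q(1, b) * q(l, a)
                                 - q(1, a) * q(l, b))) * u2)
        # squares of beta_l(u) and gamma_l(u)
        var_a = 1 - q(l, a) ** 2 - (q(1, l) ** 2 + q(1, a) ** 2
                                    - 2 * q(1, l) * q(1, a) * q(l, a)) * u2
        var_b = 1 - q(l, b) ** 2 - (q(1, l) ** 2 + q(1, b) ** 2
                                    - 2 * q(1, l) * q(1, b) * q(l, b)) * u2
        ratio = alpha / math.sqrt(var_a * var_b)
        ratio = max(-1.0, min(1.0, ratio))         # guard against round-off
        return q(1, l) / math.sqrt(1 - q(1, l) ** 2 * u2) * math.asin(ratio)

    h = 1.0 / steps                                # steps must be even
    total = 0.0
    for l in (2, 3, 4):
        s = integrand(l, 0.0) + integrand(l, 1.0)
        for k in range(1, steps):
            s += (4 if k % 2 else 2) * integrand(l, k * h)
        total += s * h / 3
    return 4 / math.pi ** 2 * total

def orthant_p4(R, steps=2000):
    """Positive orthant probability P_4^0(R) via (11)."""
    asin_sum = sum(math.asin(R[r][s]) for r in range(4) for s in range(r + 1, 4))
    return (1 + 2 / math.pi * asin_sum + childs_w(R, steps)) / 16

identity = [[1.0 if r == s else 0.0 for s in range(4)] for r in range(4)]
equicorr = [[1.0 if r == s else 0.5 for s in range(4)] for r in range(4)]
print(orthant_p4(identity), orthant_p4(equicorr))
```

Two sanity checks are available without any table: independent variables give $P_4^0 = 1/16$, and the equicorrelated case with all $\varrho_{rs} = 1/2$ has the known orthant probability $1/5$.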
B. Some Well Known Results



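Lemma 2: Let $\{(X_i, Y_i)\}_{i=1}^{n}$ denote n i.i.d. bivariate normal data pairs with correlation coefficient $\rho$. Let $r_P$, $r_S$ and $r_K$ be the PPMCC, SR and KT defined in (1)-(3), respectively. Write $S_1 = \sin^{-1}\rho$ and $S_2 = \sin^{-1}\frac{\rho}{2}$. Then

$$E(r_P) = \rho\left[1 - \frac{1 - \rho^2}{2n} + O(n^{-2})\right] \simeq \rho \quad \text{as } n \to \infty \tag{14}$$

$$V(r_P) \simeq \frac{(1 - \rho^2)^2}{n - 1} \tag{15}$$

$$E(r_S) = \frac{6}{\pi(n + 1)}\left[\sin^{-1}\rho + (n - 2)\sin^{-1}\frac{\rho}{2}\right] \tag{16}$$

$$\phantom{E(r_S)} \simeq \frac{6}{\pi}\sin^{-1}\frac{\rho}{2} \quad \text{as } n \to \infty \tag{17}$$

$$E(r_K) = \frac{2}{\pi}\sin^{-1}\rho \tag{18}$$

$$V(r_K) = \frac{2}{n(n - 1)}\left[1 - \frac{4S_1^2}{\pi^2} + 2(n - 2)\left(\frac{1}{9} - \frac{4S_2^2}{\pi^2}\right)\right]. \tag{19}$$

Proof: The first three results, (14)-(16), were given by Hotelling [43], Fisher [14], and Moran [44], respectively; whereas the last two results, (18) and (19), were derived by Esscher.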
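The closed forms for the rank coefficients in Lemma 2 are easy to evaluate numerically. The short sketch below codes the exact mean of $r_S$, the mean of $r_K$ and the variance of $r_K$ as functions of $\rho$ and n; the function names are hypothetical and chosen only for this illustration.

```python
import math

def mean_spearman(rho, n):
    """E(r_S) for bivariate normal samples of size n."""
    s1, s2 = math.asin(rho), math.asin(rho / 2)
    return 6 / (math.pi * (n + 1)) * (s1 + (n - 2) * s2)

def mean_kendall(rho):
    """E(r_K); it does not depend on the sample size."""
    return 2 / math.pi * math.asin(rho)

def var_kendall(rho, n):
    """V(r_K) for bivariate normal samples of size n."""
    s1, s2 = math.asin(rho), math.asin(rho / 2)
    pi2 = math.pi ** 2
    return 2 / (n * (n - 1)) * (1 - 4 * s1 ** 2 / pi2
                                + 2 * (n - 2) * (1 / 9 - 4 * s2 ** 2 / pi2))

for rho in (0.0, 0.5, 0.9):
    print(rho, mean_spearman(rho, 20), mean_kendall(rho), var_kendall(rho, 20))
```

At $\rho = 0$ the variance reduces to the familiar null value $2(2n + 5)/[9n(n - 1)]$, and at $\rho = 1$ both means equal one while the variance of $r_K$ vanishes.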



IV. Main Results in Normal and Contaminated Normal Models 

In this section we establish our main results concerning $V(r_S)$ and $C(r_S, r_K)$ in the normal model as well as $E(r_S)$ and $E(r_K)$ in the contaminated normal model. We start by revisiting $V(r_S)$ in normal samples. Being the most challenging part and of fundamental importance for further development, the new result on $V(r_S)$ deserves to be formulated as a theorem.



A. Variance of Spearman's rho 

Theorem 1: Let $\{(X_i, Y_i)\}_{i=1}^{n}$, $S_1$ and $S_2$ be defined as in Lemma 2. Write $\xi \in \{c, d, e, f, g, h, l, m, n, o, p, q\}$. Let $W_\xi$ be defined as in (12) with respect to $R_\xi$ tabulated in Table IV. Then the variance of $r_S(X, Y)$ is

$$V(r_S) = \frac{6}{n(n+1)} + \frac{9(n-2)(n-3)}{n(n^2-1)(n+1)}\left[(n-4)\Omega_1(\rho) + \Omega_2(\rho)\right] - \frac{36}{\pi^2 n(n^2-1)(n+1)}\left[3(n-2)(3n^2-15n+22)S_2^2 + 12(n-2)^2 S_1 S_2 - 2(n-3)S_1^2\right] \tag{20}$$

where

$$\Omega_1(\rho) = W_c + 8W_d + 2W_f \tag{21}$$

Oa(p) = 6W 3 + 8W h + 6VK ; + 2V^„ + W + \. (22)

Moreover, when n is sufficiently large,

$$V(r_S) \simeq \frac{1}{n}\left[9\Omega_1(\rho) - \frac{324 S_2^2}{\pi^2}\right]. \tag{23}$$

Proof: See Appendix A. ■
Remark 1: Unlike the Taylor-expansion-based approximate formulae in the literature [22], [27]-[29], the expression (20) in Theorem 1 is exact for any sample size $n \ge 4$ and any population correlation $\rho \in [-1, 1]$. However, due to the complicated integrals involved in the W-terms in $\Omega_1(\rho)$ and $\Omega_2(\rho)$, the variance of $r_S$ cannot be expressed in elementary functions in general. In other words, we need to conduct numerical integrations based on Childs's reduction formula (13) so as to calculate $\Omega_1(\rho)$ and $\Omega_2(\rho)$, and hence $V(r_S)$, from (20). Nevertheless, exact results can be obtained for some particular cases. It can be shown that (Appendix B)

$$\Omega_1(0) = \frac{1}{9}, \quad \Omega_2(0) = \frac{5}{9} \tag{24}$$

$$\Omega_1(1) = 1, \quad \Omega_2(1) = \frac{16}{3}. \tag{25}$$

Substituting $\rho = 0$ and (24) into (20) leads directly to

$$V(r_S)\big|_{\rho=0} = \frac{1}{n - 1} \tag{26}$$

which is a well-known result [15]. Substituting $\rho = 1$ and (25) into (20) and (23), together with some simplifications, yields

$$V(r_S)\big|_{\rho=1} = 0 \tag{27}$$

which is of no surprise but, to our knowledge, has never been proven explicitly in the literature (although indirect arguments can be found [38]). Note that $V(r_S)$ also vanishes for $\rho = -1$ due to symmetry.
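The special cases (24)-(27) offer a quick numerical check of the variance formula (20). The sketch below is an illustration under stated assumptions: the function names are hypothetical, $\Omega_1$ and $\Omega_2$ are passed in as numbers (for instance from Table I), and the signs inside the last bracket of (20) are taken as $+12(n-2)^2 S_1 S_2$ and $-2(n-3)S_1^2$, which is the arrangement that reproduces both (26) and (27).

```python
import math

def var_spearman(n, rho, omega1, omega2):
    """V(r_S) from (20), given Omega_1(rho) and Omega_2(rho)."""
    s1, s2 = math.asin(rho), math.asin(rho / 2)
    denom = n * (n * n - 1) * (n + 1)
    # Assumed sign layout of the last bracket; it reproduces (26) and (27).
    bracket = (3 * (n - 2) * (3 * n * n - 15 * n + 22) * s2 ** 2
               + 12 * (n - 2) ** 2 * s1 * s2
               - 2 * (n - 3) * s1 ** 2)
    return (6 / (n * (n + 1))
            + 9 * (n - 2) * (n - 3) / denom * ((n - 4) * omega1 + omega2)
            - 36 / (math.pi ** 2 * denom) * bracket)

def var_spearman_large_n(n, rho, omega1):
    """Large-sample approximation (23)."""
    s2 = math.asin(rho / 2)
    return (9 * omega1 - 324 * s2 ** 2 / math.pi ** 2) / n

# Exact special cases (24)-(27)
print(var_spearman(10, 0.0, 1 / 9, 5 / 9), 1 / 9)
print(var_spearman(10, 1.0, 1.0, 16 / 3))
# Intermediate case using the Table I entries for rho = 0.50
print(var_spearman(20, 0.5, 0.3029841008, 1.2956050434))
```

For intermediate $\rho$ the two $\Omega$ values have to come from numerical integration or from Table I, as in the last line of the example.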



B. Covariance between Spearman's rho and Kendall's tau 

Besides the variance of SR just established in Theorem 1, the covariance between SR and KT is also indispensable for revealing the basic properties of the estimators to be discussed in Section V.

Theorem 2: Let $\{(X_i, Y_i)\}_{i=1}^{n}$, $S_1$ and $S_2$ be defined as in Lemma 2. Then the covariance between $r_S(X, Y)$ and $r_K(X, Y)$ is

$$C(r_S, r_K) = \frac{12}{n(n^2 - 1)}\left[\frac{7n - 5}{18} + (n - 4)\frac{S_1^2}{\pi^2} - 5(n - 2)\frac{S_2^2}{\pi^2} - 6(n - 2)^2\frac{S_1 S_2}{\pi^2} + (n - 2)(n - 3)\Omega_3(\rho)\right] \tag{28}$$

$$\phantom{C(r_S, r_K)} \simeq \frac{12}{n}\left[\Omega_3(\rho) - 6\frac{S_1 S_2}{\pi^2}\right] \quad \text{(as n large)} \tag{29}$$

where

Mp) = ~w g + w h . (30)

Proof: See Appendix C. ■
Remark 2: The technique employed in Appendix C can also provide an alternative proof of $V(r_K)$ in (19), by the relationship

The interested reader, after trying this, will find that this proof is much simpler than the characteristic-function-based argument detailed in [13].

Corollary 1: In Theorem 2, the covariance $C(r_S, r_K)$ can also be expressed as

12 



C(rs,r K ) 



n(n 2 — 1 
-6(n-2) 



__ +(n _ 4 )--5(n-2)- 



2 S1S2 



-^(n-2)(n-3)0 4 (p) 



12 

n 



7T 

S1S2 



1 QJp) 

18 7T^ 7T 



(as n large) 



(31) 
(32) 



where 



4 (p) = 



sm — 



sm 




(33) 



4-2x 2 y/I^X 2 " 



Proof: Inverting (11) yields

$$W = 16 P_4^0 - 1 - \frac{2}{\pi}\sum_{r=1}^{3}\sum_{s=r+1}^{4}\sin^{-1}\varrho_{rs}. \tag{34}$$

Combining (30) and (34), $\Omega_3(\rho)$ can be rewritten in terms of $P_4^0$ and the correlation coefficients corresponding to $R_g$ and $R_h$, in Appendix 2 of [29]. This leads to

Mp) = + 4 n *00- (35) 

18 7T Z 

The corollary thus follows directly by substituting (35) into (28) and (29), respectively. ■
Remark 3: Both (28) and (31) are exact for any value of $n \ge 4$ and $|\rho| \le 1$. However, they differ in usefulness for different numerical and analytical purposes. Formula (28) is more convenient in the sense of controlling the precision of numerical integrations when programming, whereas (31) is more convenient in the sense of evaluating derivatives of any order ($\ge 1$) of $C(r_S, r_K)$ with respect to $\rho$. These higher-order derivatives are mandatory when expanding $C(r_S, r_K)$ as a power series in $\rho$, a conventional practice in the literature. For example, performing the Taylor expansion of (32) with the assistance of (33) gives

$$C(r_S, r_K) \simeq \frac{2}{3n}\left(1 - 1.24858961\rho^2 + 0.06830496\rho^4 + 0.07280482\rho^6 + 0.04025528\rho^8 + 0.02189277\rho^{10}\right) \tag{36}$$

which agrees with formula (51) obtained in [28], except for the coefficients of the last two terms, which we find to be 0.04025528 and 0.02189277, against their 0.04025526 and 0.01641362, respectively. Since $\Omega_4(\rho)$ in (33) is exact, we believe that (36) is more accurate than (51) in [28]. Unfortunately, even (36) is too coarse when n is small and/or $|\rho|$ is large. To satisfy the requirements of the current study, we prefer the $\Omega_3$-based formula (28), which can provide numerical results to any desired decimal place. For the convenience of ourselves as well as other researchers, a densely tabulated table, Table II, for $\Omega_3(\rho)$ with ten-place accuracy is provided in Section VI.

Remark 4: Due to the complicated integrals involved in (28) and (31), $C(r_S, r_K)$ cannot be expressed in elementary functions. However, exact results are attainable for $\rho = 0$ and $\rho = 1$ ($-1$). It follows that (Appendix B)

$$\Omega_3(0) = \frac{1}{18} \tag{37}$$

$$\Omega_3(1) = \frac{1}{2}. \tag{38}$$

Substituting (37) into (28) yields

$$C(r_S, r_K)\big|_{\rho=0} = \frac{2}{3n}\,\frac{n + 1}{n - 1} \tag{39}$$

which is more readily obtained on substitution of $\rho = 0$ into (31). Regarding the case $\rho = 1$, it is rather difficult to substitute $\rho = 1$ into (31) and evaluate $\Omega_4(1)$ based on (33) thereafter. Fortunately, with the help of (38), it follows readily from (28) and (29) that $C(r_S, r_K)|_{\rho=1} = 0$, which, again, is of no surprise but, to our knowledge, has never been explicitly proven in the literature. Due to symmetry, we also have $C(r_S, r_K)|_{\rho=-1} = 0$.
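The covariance results can be cross-checked numerically. The sketch below codes the finite-sample expression (28), its large-sample form (29) and the power series (36), with $\Omega_3$ supplied as a number (for instance from Table II). The function names are hypothetical, and the bracket in (28) is taken with the terms $(7n-5)/18$, $+(n-4)S_1^2/\pi^2$, $-5(n-2)S_2^2/\pi^2$, $-6(n-2)^2 S_1 S_2/\pi^2$ and $+(n-2)(n-3)\Omega_3(\rho)$, an arrangement assumed here because it reproduces (39) and the vanishing covariance at $\rho = 1$.

```python
import math

def cov_spearman_kendall(n, rho, omega3):
    """C(r_S, r_K) from (28), given Omega_3(rho)."""
    s1, s2 = math.asin(rho), math.asin(rho / 2)
    pi2 = math.pi ** 2
    bracket = ((7 * n - 5) / 18
               + (n - 4) * s1 ** 2 / pi2
               - 5 * (n - 2) * s2 ** 2 / pi2
               - 6 * (n - 2) ** 2 * s1 * s2 / pi2
               + (n - 2) * (n - 3) * omega3)
    return 12 / (n * (n * n - 1)) * bracket

def cov_large_n(n, rho, omega3):
    """Large-sample form (29)."""
    s1, s2 = math.asin(rho), math.asin(rho / 2)
    return 12 / n * (omega3 - 6 * s1 * s2 / math.pi ** 2)

def cov_series(n, rho):
    """Power series (36), truncated after the rho^10 term."""
    coeffs = [1.0, -1.24858961, 0.06830496, 0.07280482, 0.04025528, 0.02189277]
    return 2 / (3 * n) * sum(c * rho ** (2 * k) for k, c in enumerate(coeffs))

# Table II entries: Omega_3(0.10) and Omega_3(0.50)
for rho, omega3 in ((0.10, 0.0579082728), (0.50, 0.1189551518)):
    print(rho, cov_large_n(1000, rho, omega3), cov_series(1000, rho))
```

For small $|\rho|$ the series and the table-based value agree closely; the gap grows with $|\rho|$, which is consistent with the remark that the series is too coarse for large correlations.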

C. $E(r_S)$ and $E(r_K)$ in Contaminated Normal Model

The PPMCC is notoriously sensitive to the non-Gaussianity caused by impulsive contamination in the data. Even a single outlier can severely distort the value of PPMCC and hence result in misleading inference in practice. Assume that (X, Y) obeys the following distribution [21]

$$\bar{\epsilon}\,\mathcal{N}(\mu_x, \mu_y, \sigma_x, \sigma_y, \rho) + \epsilon\,\mathcal{N}(\mu_x, \mu_y, \lambda_x\sigma_x, \lambda_y\sigma_y, \rho') \tag{40}$$

where $\bar{\epsilon} = 1 - \epsilon$, $0 \le \epsilon \le 1$, $\lambda_x \gg 1$, and $\lambda_y \gg 1$. Under this Gaussian contamination model, which is frequently used in the literature on robustness analysis [18]-[20], it has been shown that, no matter how small $\epsilon$ is, the expectation of the PPMCC $E(r_P) \to \rho'$ as $\lambda_x \to \infty$ and $\lambda_y \to \infty$ [21]. On the other hand, as shown in the theorem below, SR and KT are more robust than PPMCC under the model (40).

Theorem 3: Let $\{(X_i, Y_i)\}_{i=1}^{n}$ be i.i.d. samples generated from the model (40). Let $r_S$ and $r_K$ be the SR and KT defined in (2) and (3), respectively. Then

$$\lim_{\substack{\epsilon \to 0 \\ \lambda_x \to \infty \\ \lambda_y \to \infty}} E(r_K) = \frac{2}{\pi}\left[(1 - 2\epsilon)\sin^{-1}\rho + 2\epsilon\sin^{-1}\rho'\right] \tag{41}$$

$$\lim_{\substack{\epsilon \to 0 \\ n \to \infty \\ \lambda_x \to \infty \\ \lambda_y \to \infty}} E(r_S) = \frac{6}{\pi}\left[(1 - 3\epsilon)\sin^{-1}\frac{\rho}{2} + \epsilon\sin^{-1}\rho'\right]. \tag{42}$$

Proof: See Appendix D. ■

Remark 5: It was stated without substantial argument in [21] that, under the model (40), $E(r_S)$ is of the following form

$$E(r_S) = \frac{6}{\pi}\left[(1 - \epsilon)\sin^{-1}\frac{\rho}{2} + \epsilon\sin^{-1}\frac{\rho'}{2}\right] \tag{*}$$

as $\epsilon \to 0$, $\lambda_x \to \infty$ and $\lambda_y \to \infty$. This is quite inconsistent with our result (42) in Theorem 3. We will resolve the controversy between (42) and (*) by Monte Carlo simulations in Section VI.
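To see how far (42) and (*) are apart, the limiting expectations can simply be evaluated side by side. The sketch below is a small illustration with hypothetical function names; `rho_c` stands for the contamination correlation $\rho'$, and the chosen values of $\epsilon$, $\rho$ and $\rho'$ are arbitrary.

```python
import math

def lim_mean_kendall(eps, rho, rho_c):
    """Limiting E(r_K) as in (41); first order in eps."""
    return 2 / math.pi * ((1 - 2 * eps) * math.asin(rho)
                          + 2 * eps * math.asin(rho_c))

def lim_mean_spearman(eps, rho, rho_c):
    """Limiting E(r_S) as in (42); first order in eps."""
    return 6 / math.pi * ((1 - 3 * eps) * math.asin(rho / 2)
                          + eps * math.asin(rho_c))

def lim_mean_spearman_star(eps, rho, rho_c):
    """The competing form (*) quoted from [21]."""
    return 6 / math.pi * ((1 - eps) * math.asin(rho / 2)
                          + eps * math.asin(rho_c / 2))

eps, rho, rho_c = 0.05, 0.8, -0.8     # illustrative values only
print(lim_mean_kendall(eps, rho, rho_c))
print(lim_mean_spearman(eps, rho, rho_c))
print(lim_mean_spearman_star(eps, rho, rho_c))
```

Both forms for $E(r_S)$ coincide at $\epsilon = 0$ and separate linearly in $\epsilon$, so a Monte Carlo experiment with a modest contamination fraction is enough to tell them apart.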

V. Estimators of the Population Correlation 

In this section, we investigate the performance of the estimators of $\rho$ based on SR and KT in terms of bias, MSE and ARE relative to PPMCC. To gain further insight into their relationship, the correlation between the two estimators $\hat{\rho}_S$ and $\hat{\rho}_K$ (defined below) is also derived.

A. Asymptotic Unbiased Estimators 

Inverting the large-sample expressions of $E(r_P)$, $E(r_S)$ and $E(r_K)$ in Lemma 2, we have the following estimators of $\rho$

$$\hat{\rho}_P \triangleq r_P \tag{43}$$

$$\hat{\rho}_S \triangleq 2\sin\left(\frac{\pi}{6} r_S\right) \tag{44}$$

$$\hat{\rho}_K \triangleq \sin\left(\frac{\pi}{2} r_K\right). \tag{45}$$

Moreover, another estimator based on a mixture of $r_S$ and $r_K$ can be constructed as [13]

$$\hat{\rho}_M \triangleq 2\sin\left(\frac{\pi}{6} r_S - \frac{\pi}{2}\,\frac{r_K - r_S}{n - 2}\right) \tag{46}$$

based on the following relationship

$$E(r_S) = \frac{6}{\pi}\left(S_2 + \frac{S_1 - 3S_2}{n + 1}\right). \tag{47}$$

In the sequel we will focus on the properties of the estimators defined in (43)-(46). Here the estimator $\hat{\rho}_P$ in (43) is employed as a benchmark due to its optimality for normal samples, in the sense of approaching the CRLB [11] when the sample size is sufficiently large.
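The role of the mixture estimator becomes clear when the exact expectations of $r_S$ and $r_K$ from Lemma 2 are fed into the three rank-based estimators. The sketch below is a minimal illustration with hypothetical function names: `rho_m` maps the pair of expectations back to $\rho$ for every n, whereas `rho_s` does so only in the large-sample limit.

```python
import math

def rho_s(r_s):
    """Estimator (44) based on Spearman's rho."""
    return 2 * math.sin(math.pi * r_s / 6)

def rho_k(r_k):
    """Estimator (45) based on Kendall's tau."""
    return math.sin(math.pi * r_k / 2)

def rho_m(r_s, r_k, n):
    """Mixture estimator (46) combining r_S and r_K."""
    return 2 * math.sin(math.pi / 6 * r_s - math.pi / 2 * (r_k - r_s) / (n - 2))

def mean_spearman(rho, n):
    """E(r_S) in bivariate normal samples of size n, equivalent to (47)."""
    return 6 / (math.pi * (n + 1)) * (math.asin(rho) + (n - 2) * math.asin(rho / 2))

def mean_kendall(rho):
    """E(r_K) in bivariate normal samples."""
    return 2 / math.pi * math.asin(rho)

rho, n = 0.7, 10
es, ek = mean_spearman(rho, n), mean_kendall(rho)
print(rho_s(es), rho_k(ek), rho_m(es, ek, n))
```

This only shows that the first-order (plug-in) bias of $\hat{\rho}_M$ is removed; the curvature of the sine still leaves the second-order bias discussed next.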

B. Bias Effect for Small Samples 

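It is noteworthy that the four estimators in (43)-(46) are unbiased only for large samples. When the sample size is small, the bias effects, as shown in the following theorem, are no longer negligible.

Theorem 4: Let $\hat{\rho}_\zeta$, $\zeta \in \{P, S, K, M\}$, be defined as in (43)-(46), respectively. Define $\mathrm{BIAS}_\zeta \triangleq E(\hat{\rho}_\zeta - \rho)$. Let $S_1$ and $S_2$ bear the same meanings as in Lemma 2. Write $\sigma_S^2 = V(r_S)$, $\sigma_K^2 = V(r_K)$ and $\sigma_{S,K} = C(r_S, r_K)$. Then, under the same assumptions as in Theorem 1,

$$\mathrm{BIAS}_P \simeq -\frac{\rho(1 - \rho^2)}{2n} \tag{48}$$

$$\mathrm{BIAS}_S \simeq \frac{\sqrt{4 - \rho^2}}{n + 1}(S_1 - 3S_2) - \frac{\pi^2\rho}{72}\sigma_S^2 \tag{49}$$

$$\mathrm{BIAS}_K \simeq -\frac{\pi^2\rho}{8}\sigma_K^2 \tag{50}$$

$$\mathrm{BIAS}_M \simeq -\frac{\pi^2\rho}{72(n - 2)^2}\left[(n + 1)^2\sigma_S^2 - 6(n + 1)\sigma_{S,K} + 9\sigma_K^2\right]. \tag{51}$$

Proof: The first statement (48) follows directly from the expression of $E(r_P)$ in Lemma 2. Now we proceed to evaluate $\mathrm{BIAS}_S$, $\mathrm{BIAS}_K$ and $\mathrm{BIAS}_M$. For convenience, write $\bar{r}_S = E(r_S)$, $\bar{r}_K = E(r_K)$, $\delta_S = r_S - \bar{r}_S$, and $\delta_K = r_K - \bar{r}_K$. Expanding (44) around $\bar{r}_S$ yields

$$\hat{\rho}_S = 2\sin\left(\frac{\pi}{6}\bar{r}_S\right) + \frac{\pi}{3}\cos\left(\frac{\pi}{6}\bar{r}_S\right)\delta_S - \frac{\pi^2}{36}\sin\left(\frac{\pi}{6}\bar{r}_S\right)\delta_S^2 + \cdots. \tag{52}$$

Taking expectations of both sides of (52), applying $E(\delta_S) = 0$ and $E(\delta_S^2) = \sigma_S^2$, and ignoring the higher-order infinitesimals, we have

$$E(\hat{\rho}_S) \simeq 2\sin\left(\frac{\pi}{6}\bar{r}_S\right) - \frac{\pi^2}{36}\sin\left(\frac{\pi}{6}\bar{r}_S\right)\sigma_S^2. \tag{53}$$

Substituting (47) into (53), expanding to the order of $(n + 1)^{-1}$ and subtracting $\rho$ thereafter, we obtain the result (49). In a similar way we have

$$E(\hat{\rho}_K) \simeq \sin\left(\frac{\pi}{2}\bar{r}_K\right) - \frac{\pi^2}{8}\sin\left(\frac{\pi}{2}\bar{r}_K\right)\sigma_K^2$$

which leads directly to (50). Performing the Taylor expansion of $\hat{\rho}_M(r_S, r_K)$ around $(\bar{r}_S, \bar{r}_K)$ up to the second order, we have

$$\hat{\rho}_M = \hat{\rho}_M(\bar{r}_S, \bar{r}_K) + \frac{\partial\hat{\rho}_M}{\partial r_S}\delta_S + \frac{\partial\hat{\rho}_M}{\partial r_K}\delta_K + \frac{1}{2}\left[\frac{\partial^2\hat{\rho}_M}{\partial r_S^2}\delta_S^2 + \frac{\partial^2\hat{\rho}_M}{\partial r_K^2}\delta_K^2 + 2\frac{\partial^2\hat{\rho}_M}{\partial r_S\,\partial r_K}\delta_S\delta_K\right] + \cdots. \tag{54}$$

Taking expectations of both sides of (54), ignoring higher-order infinitesimals, applying the results $\hat{\rho}_M(\bar{r}_S, \bar{r}_K) = \rho$, $E(\delta_S) = 0$, $E(\delta_K) = 0$, $E(\delta_S^2) = \sigma_S^2$, $E(\delta_K^2) = \sigma_K^2$, $E(\delta_S\delta_K) = \sigma_{S,K}$ along with the second-order partial derivatives

$$\frac{\partial^2\hat{\rho}_M(\bar{r}_S, \bar{r}_K)}{\partial r_S^2} = -\frac{\pi^2\rho}{36}\,\frac{(n + 1)^2}{(n - 2)^2}$$

$$\frac{\partial^2\hat{\rho}_M(\bar{r}_S, \bar{r}_K)}{\partial r_K^2} = -\frac{\pi^2\rho}{4}\,\frac{1}{(n - 2)^2}$$

$$\frac{\partial^2\hat{\rho}_M(\bar{r}_S, \bar{r}_K)}{\partial r_S\,\partial r_K} = \frac{\pi^2\rho}{12}\,\frac{n + 1}{(n - 2)^2}$$

and subtracting $\rho$ thereafter, we arrive at the fourth theorem statement (51), thus completing the proof. ■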

Remark 6: From (48)-(51), it follows that, for all four estimators,

• $\mathrm{BIAS}_\zeta(\rho) = -\mathrm{BIAS}_\zeta(-\rho)$ (odd symmetry);

• $\rho\,\mathrm{BIAS}_\zeta(\rho) \le 0$ (negative bias);
• $\mathrm{BIAS}_\zeta = 0$ for $\rho \in \{-1, 0, 1\}$;

• $\mathrm{BIAS}_\zeta = O(n^{-1})$ as $n \to \infty$.
Moreover, contrary to $\mathrm{BIAS}_P$ and $\mathrm{BIAS}_K$, $\mathrm{BIAS}_S$ and $\mathrm{BIAS}_M$ cannot be expressed in elementary functions due to the intractability involved in (20) and (28), the expressions of $V(r_S)$ and $C(r_S, r_K)$, respectively.
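The two biases that are available in elementary form, (48) and (50), can be evaluated directly and used to check the properties listed in Remark 6. The sketch below is illustrative only; the function names are hypothetical, and `var_kendall` restates the variance of $r_K$ from Lemma 2 so that the block is self-contained.

```python
import math

def var_kendall(rho, n):
    """V(r_K) in bivariate normal samples of size n (Lemma 2)."""
    s1, s2 = math.asin(rho), math.asin(rho / 2)
    pi2 = math.pi ** 2
    return 2 / (n * (n - 1)) * (1 - 4 * s1 ** 2 / pi2
                                + 2 * (n - 2) * (1 / 9 - 4 * s2 ** 2 / pi2))

def bias_p(rho, n):
    """Approximate bias (48) of the PPMCC-based estimator."""
    return -rho * (1 - rho ** 2) / (2 * n)

def bias_k(rho, n):
    """Approximate bias (50) of the KT-based estimator."""
    return -math.pi ** 2 * rho / 8 * var_kendall(rho, n)

for rho in (-0.9, -0.5, 0.0, 0.5, 0.9):
    print(rho, bias_p(rho, 20), bias_k(rho, 20))
```

Both functions are odd in $\rho$, vanish at $\rho \in \{-1, 0, 1\}$, and shrink like $1/n$; the corresponding checks for $\mathrm{BIAS}_S$ and $\mathrm{BIAS}_M$ would additionally need $V(r_S)$ and $C(r_S, r_K)$ from (20) and (28).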

C. Approximation of Variances 

Besides the bias effect just discussed, the variance is another important figure of merit when comparing the performance of the estimators $\hat{\rho}_\zeta$, $\zeta \in \{P, S, K, M\}$. From the expression of $V(r_P)$ in Lemma 2, it follows that

$$V(\hat{\rho}_P) \simeq \frac{(1 - \rho^2)^2}{n - 1}. \tag{55}$$

By the delta method, it follows that [13]

$$V(\hat{\rho}_S) \simeq \frac{\pi^2(4 - \rho^2)}{36} V(r_S) \tag{56}$$

$$V(\hat{\rho}_K) \simeq \frac{\pi^2(1 - \rho^2)}{4} V(r_K). \tag{57}$$

Now we only need to deal with $V(\hat{\rho}_M)$, which is stated below.

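Theorem 5: Let $\hat{\rho}_M$ be defined as in (46). Then, under the same assumptions as in Theorem 1,

$$V(\hat{\rho}_M) \simeq \frac{\pi^2(4 - \rho^2)}{36(n - 2)^2}\left[(n + 1)^2\sigma_S^2 - 6(n + 1)\sigma_{S,K} + 9\sigma_K^2\right]. \tag{58}$$

Proof: Using the delta method [11], it follows that

$$V(\hat{\rho}_M) \simeq \left[\frac{\partial\hat{\rho}_M}{\partial r_S}\right]^2\sigma_S^2 + \left[\frac{\partial\hat{\rho}_M}{\partial r_K}\right]^2\sigma_K^2 + 2\,\frac{\partial\hat{\rho}_M}{\partial r_S}\,\frac{\partial\hat{\rho}_M}{\partial r_K}\,\sigma_{S,K}. \tag{59}$$

The theorem thus follows with substitutions of the partial derivatives

$$\frac{\partial\hat{\rho}_M(\bar{r}_S, \bar{r}_K)}{\partial r_S} = \frac{\pi}{6}\,\frac{n + 1}{n - 2}\sqrt{4 - \rho^2}$$

$$\frac{\partial\hat{\rho}_M(\bar{r}_S, \bar{r}_K)}{\partial r_K} = -\frac{\pi}{2}\,\frac{1}{n - 2}\sqrt{4 - \rho^2}$$

into (59), respectively. ■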
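As a worked instance of (58), consider $\rho = 0$, where all three second moments are known in closed form: $\sigma_S^2 = 1/(n-1)$ from (26), $\sigma_K^2 = 2(2n+5)/[9n(n-1)]$ from Lemma 2, and $\sigma_{S,K} = 2(n+1)/[3n(n-1)]$ from (39). The sketch below, with hypothetical function names, compares $V(\hat{\rho}_M)$ with $V(\hat{\rho}_S)$ from (56) in this case.

```python
import math

def var_rho_m(n, rho, var_s, var_k, cov_sk):
    """Delta-method variance (58) of the mixture estimator."""
    factor = math.pi ** 2 * (4 - rho ** 2) / (36 * (n - 2) ** 2)
    return factor * ((n + 1) ** 2 * var_s - 6 * (n + 1) * cov_sk + 9 * var_k)

def var_rho_s(rho, var_s):
    """Delta-method variance (56) of the SR-based estimator."""
    return math.pi ** 2 * (4 - rho ** 2) / 36 * var_s

def ratio_at_zero(n):
    """V(rho_M) / V(rho_S) at rho = 0, using the exact null moments."""
    var_s = 1 / (n - 1)
    var_k = 2 * (2 * n + 5) / (9 * n * (n - 1))
    cov_sk = 2 * (n + 1) / (3 * n * (n - 1))
    return var_rho_m(n, 0.0, var_s, var_k, cov_sk) / var_rho_s(0.0, var_s)

for n in (10, 100, 1000):
    print(n, ratio_at_zero(n))
```

At $\rho = 0$ the ratio simplifies to $[(n+1)^2(n-4) + 4n + 10]/[n(n-2)^2]$, which exceeds one for small n and tends to one as n grows, in line with the equality of the two asymptotic efficiencies.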






D. Asymptotic Relative Efficiency

Thus far in this section we have established the analytical results with an emphasis on finite-sized bivariate normal samples. For a better understanding of the four estimators, we now shift our attention to the asymptotic properties of $\hat{\rho}_\zeta$. Since $\lim_{n \to \infty} E(\hat{\rho}_\zeta) = \rho$, we can compare their performance by means of the asymptotic relative efficiency, which is defined as [11]

$$\mathrm{ARE}_\zeta \triangleq \lim_{n \to \infty}\frac{V(\hat{\rho}_P)}{V(\hat{\rho}_\zeta)}, \quad \zeta \in \{P, S, K, M\}. \tag{60}$$

As remarked before, we employ $\hat{\rho}_P$ as a benchmark, since, for large bivariate normal samples, the variance of $\hat{\rho}_P$ approaches the Cramer-Rao lower bound (CRLB) [11]

$$\mathrm{CRLB} = \frac{(1 - \rho^2)^2}{n}. \tag{61}$$

From (60) it is obvious that $\mathrm{ARE}_P = 1$. Moreover, comparing (56) and (58), it is easily seen that $\lim_{n \to \infty} V(\hat{\rho}_S)/V(\hat{\rho}_M) = 1$, which leads readily to $\mathrm{ARE}_S = \mathrm{ARE}_M$ by referring to (60). Hence we only need to focus on $\mathrm{ARE}_S$ and $\mathrm{ARE}_K$ in the following discussion.

Theorem 6: Let $\mathrm{ARE}_S$ and $\mathrm{ARE}_K$ be defined as in (60). Then

$$\mathrm{ARE}_S = \frac{36(1 - \rho^2)^2}{(4 - \rho^2)\left[9\pi^2\Omega_1(\rho) - 324\left(\sin^{-1}\frac{\rho}{2}\right)^2\right]} \tag{62}$$

$$\mathrm{ARE}_K = \frac{9(1 - \rho^2)}{\pi^2 - 36\left(\sin^{-1}\frac{\rho}{2}\right)^2}. \tag{63}$$

Proof: Substituting (56) and (57) into (60) yields (62) and (63), respectively, and the proof completes. ■

Remark 7: Due to the intractability of $\Omega_1(\rho)$ in (62), $\mathrm{ARE}_S$ cannot be expressed in elementary functions in general. However, exact results are obtainable for $\rho = 0, \pm 1$. Substituting $\rho = 0$ and $\Omega_1(0) = 1/9$ into (62), it is easy to verify that

$$\mathrm{ARE}_S(0) = \frac{9}{\pi^2} \approx 0.9119$$

which is a well-known result [15]. In our previous work [38] we also obtained that

$$\mathrm{ARE}_S(\pm 1) = \frac{15 + 11\sqrt{5}}{57} \approx 0.6947. \tag{64}$$

Now let us investigate $\mathrm{ARE}_K$. It follows from (63) that $\mathrm{ARE}_K$ is expressible in elementary functions of $\rho$, and is therefore more tractable than $\mathrm{ARE}_S$. In other words, we can easily evaluate $\mathrm{ARE}_K$ for any value of $\rho \ne \pm 1$. For example, substituting $\rho = 0$ into (63) yields

$$\mathrm{ARE}_K(0) = \frac{9}{\pi^2}$$

which is identical to $\mathrm{ARE}_S(0)$ and also well known [13]. However, when $\rho \to \pm 1$, an extra effort is necessary, since both the numerator and the denominator of (63) vanish in this case. Applying L'Hopital's rule, we find the following result

$$\mathrm{ARE}_K(\pm 1) = \left.\frac{\rho\sqrt{4 - \rho^2}}{4\sin^{-1}\frac{\rho}{2}}\right|_{\rho = \pm 1} = \frac{3\sqrt{3}}{2\pi} \approx 0.8270 \tag{65}$$

which is greater than $\mathrm{ARE}_S(\pm 1)$. In fact, a comparison of $\mathrm{ARE}_S$ and $\mathrm{ARE}_K$ in Section VI suggests that $\mathrm{ARE}_S \le \mathrm{ARE}_K$ for all $\rho \in [-1, 1]$.
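The efficiency formulas (62) and (63) can be evaluated in a few lines. In the sketch below the function names are hypothetical, $\Omega_1(\rho)$ is supplied as a number (for instance the Table I entry for $\rho = 0.50$), and the limit $\rho \to 1$ of $\mathrm{ARE}_K$ is approached numerically rather than through the closed form (65).

```python
import math

def are_k(rho):
    """ARE of the KT-based estimator, formula (63); requires |rho| < 1."""
    return 9 * (1 - rho ** 2) / (math.pi ** 2 - 36 * math.asin(rho / 2) ** 2)

def are_s(rho, omega1):
    """ARE of the SR-based estimator, formula (62), given Omega_1(rho)."""
    s2 = math.asin(rho / 2)
    return (36 * (1 - rho ** 2) ** 2
            / ((4 - rho ** 2) * (9 * math.pi ** 2 * omega1 - 324 * s2 ** 2)))

print(are_s(0.0, 1 / 9), are_k(0.0))        # both equal 9 / pi^2
print(are_s(0.5, 0.3029841008), are_k(0.5)) # Omega_1(0.50) from Table I
print(are_k(0.999999), 3 * math.sqrt(3) / (2 * math.pi))
```

At $\rho = 0.5$ the SR-based value falls below the KT-based one, which agrees with the ordering $\mathrm{ARE}_S \le \mathrm{ARE}_K$ stated above.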

VI. Numerical Results 

In this section we aim at 1) tabulating the values of $\Omega_1(\rho)$, $\Omega_2(\rho)$ (in Theorem 1) and $\Omega_3(\rho)$ (in Theorem 2), which are not expressible in elementary functions, 2) verifying the theoretical results, Theorems 1 to 6, established in the previous sections, and 3) comparing the performance of the four estimators defined in (43)-(46) by means of bias, mean square error (MSE) and ARE under both the normal and contaminated normal models. Throughout this section, Monte Carlo experiments are undertaken for $10 \le n \le 100$. A sample size of n = 1000 is considered large enough when we investigate the asymptotic behaviors. The number of trials is set to $5 \times 10^5$ for reasons of accuracy.

TABLE I

Values of $\Omega_1(\rho)$ and $\Omega_2(\rho)$ in Theorem 1



ρ 0.00+ 0.10+ 0.20+ 0.30+ 0.40+ 0.50+ 0.60+ 0.70+ 0.80+ 0.90+

0.00 0.1111111111 0.1185038469 0.1408155407 0.1784569533 0.2321489296 0.3029841008 0.3925307865 0.5030051934 0.6375648509 0.8008401854 
0.01 0.1111849294 0.1200591282 0.1438805317 0.1830890232 0.2384392872 0.3110659830 0.4025930108 0.5153148070 0.6525067154 0.8189917929 
0.02 0.1114063976 0.1217636535 0.1470991369 0.1878822020 0.2449020129 0.3193363285 0.4128663989 0.5278679000 0.6677394272 0.8375074957 
0.03 0.1117755552 0.1236177309 0.1504719575 0.1928374325 0.2515384762 0.3277970726 0.4233536847 0.5406684104 0.6832688919 0.8563967883 
0.04 0.1122924682 0.1256216966 0.1539996256 0.1979556956 0.2583500955 0.3364502180 0.4340577002 0.5537204299 0.6991012791 0.8756696820 
0.05 0.1129572289 0.1277759147 0.1576828058 0.2032380114 0.2653383401 0.3452978372 0.4449813799 0.5670282120 0.7152430387 0.8953367468 
0.06 0.1137699564 0.1300807778 0.1615221950 0.2086854397 0.2725047311 0.3543420748 0.4561277650 0.5805961801 0.7317009189 0.9154091574 
0.07 0.1147307963 0.1325367071 0.1655185233 0.2142990814 0.2798508436 0.3635851506 0.4675000084 0.5944289365 0.7484819855 0.9358987450 
0.08 0.1158399207 0.1351441527 0.1696725544 0.2200800791 0.2873783079 0.3730293618 0.4791013791 0.6085312723 0.7655936432 0.9568180554 
0.09 0.1170975289 0.1379035942 0.1739850867 0.2260296186 0.2950888115 0.3826770865 0.4909352680 0.6229081771 0.7830436585 0.9781804140 



0.00 0.5555555556 0.5831779199 0.6669411548 0.8096548728 1.0164469967 1.2956050434 1.6602428005 2.1318398440 2.7490961353 3.5988067890 

0.01 0.5558310525 0.5889973638 0.6784952270 0.8273423444 1.0409353198 1.3279500727 1.7021306947 2.1861334471 2.8213687297 3.7041428194 

0.02 0.5566576312 0.5953785532 0.6906408759 0.8456746565 1.0661538717 1.3611599188 1.7451031281 2.2419045230 2.8959717933 3.8148964800 

0.03 0.5580355547 0.6023235637 0.7033822725 0.8646587057 1.0921135054 1.3952518608 1.7891893159 2.2992092121 2.9730475022 3.9318774005 

0.04 0.5599652624 0.6098346629 0.7167238250 0.8843017153 1.1188255760 1.4302440445 1.8344202199 2.3581081295 3.0527562596 4.0561698333 

0.05 0.5624473698 0.6179143137 0.7306701849 0.9046112480 1.1463019652 1.4661555384 1.8808286962 2.4186669028 3.1352804112 4.1892949503 

0.06 0.5654826698 0.6265651776 0.7452262544 0.9255952194 1.1745551090 1.5030063944 1.9284496603 2.4809568021 3.2208290418 4.3335266816 

0.07 0.5690721334 0.6357901183 0.7603971935 0.9472619127 1.2035980265 1.5408177141 1.9773202724 2.5450554807 3.3096442738 4.4925913045 

0.08 0.5732169115 0.6455922055 0.7761884285 0.9696199936 1.2334443518 1.5796117219 2.0274801460 2.6110478526 3.4020096999 4.6735571704 

0.09 0.5779183355 0.6559747193 0.7926056602 0.9926785275 1.2641083679 1.6194118448 2.0789715836 2.6790271403 3.4982619270 4.8941554764 



(a) In the upper panel are the values of $\Omega_1(\rho)$, and in the lower panel (shaded area) are the values of $\Omega_2(\rho)$; 

(b) $\Omega_1(0) = 1/9$, $\Omega_1(1) = 1$, $\Omega_1(-\rho) = \Omega_1(\rho)$; 

(c) $\Omega_2(0) = 5/9$, $\Omega_2(1) = 16/3$, $\Omega_2(-\rho) = \Omega_2(\rho)$. 



TABLE II 

Values of $\Omega_3(\rho)$ in Theorem 2 

$\rho$ 0.00+ 0.10+ 0.20+ 0.30+ 0.40+ 0.50+ 0.60+ 0.70+ 0.80+ 0.90+ 

0.00 0.0555555556 0.0579082728 0.0650488362 0.0772363477 0.0949464623 0.1189551518 0.1505074640 0.1916842312 0.2463473665 0.3235686523 
0.01 0.0555790160 0.0584040661 0.0660345125 0.0787487694 0.0970477974 0.1217448819 0.1541471283 0.1964547749 0.2528155359 0.3333572518 
0.02 0.0556494053 0.0589477686 0.0670708481 0.0803167902 0.0992127259 0.1246109820 0.1578844390 0.2013621185 0.2595091876 0.3437145088 
0.03 0.0557667477 0.0595395710 0.0681582284 0.0819410523 0.1014422703 0.1275551071 0.1617222612 0.2064119567 0.2664434835 0.3547334972 
0.04 0.0559310834 0.0601796821 0.0692970610 0.0836222291 0.1037375018 0.1305789987 0.1656636397 0.2116104637 0.2736356357 0.3665400166 
0.05 0.0561424692 0.0608683287 0.0704877763 0.0853610263 0.1060995431 0.1336844907 0.1697118152 0.2169643536 0.2811053381 0.3793122556 
0.06 0.0564009777 0.0616057558 0.0717308284 0.0871581833 0.1085295710 0.1368735157 0.1738702420 0.2224809499 0.2888753252 0.3933192258 
0.07 0.0567066983 0.0623922274 0.0730266953 0.0890144747 0.1110288192 0.1401481121 0.1781426083 0.2281682700 0.2969721099 0.4090062849 
0.08 0.0570597366 0.0632280263 0.0743758804 0.0909307119 0.1135985821 0.1435104314 0.1825328587 0.2340351233 0.3054269741 0.4272272392 
0.09 0.0574602150 0.0641134550 0.0757789125 0.0929077445 0.1162402177 0.1469627471 0.1870452202 0.2400912305 0.3142773322 0.4501481060 



Note that (a) $\Omega_3(0) = 1/18$, (b) $\Omega_3(1) = 1/2$, and (c) $\Omega_3(-\rho) = \Omega_3(\rho)$. 



All samples are generated by functions in the Matlab Statistics Toolbox™. Specifically, the normal samples 
are generated by mvnrnd, whereas the contaminated normal samples are generated by gmdistribution and 
random. The notation $\rho = \rho_1(\Delta\rho)\rho_2$ represents a list of $\rho$ values running from $\rho_1$ to $\rho_2$ with increment $\Delta\rho$. 
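
For readers without access to Matlab, the sampling step can be sketched in plain Python. This is only an illustrative stand-in for mvnrnd and gmdistribution, not the code used for the experiments. It assumes that the contaminated model is the two-component mixture $(1-\epsilon)\,N(0,0,1,1,\rho) + \epsilon\,N(0,0,\lambda_x^2,\lambda_y^2,\rho')$ with unit standard deviations in the clean component; the function names `rho_grid`, `bivariate_normal` and `contaminated_normal_sample` are hypothetical.

```python
import math
import random


def rho_grid(rho1, step, rho2):
    """Return rho1, rho1 + step, ..., rho2, i.e. the rho1(step)rho2 notation."""
    count = int(round((rho2 - rho1) / step)) + 1
    return [round(rho1 + k * step, 10) for k in range(count)]


def bivariate_normal(rng, rho, sx=1.0, sy=1.0):
    """Draw one zero-mean pair (X, Y) with std devs sx, sy and correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x = sx * z1
    y = sy * (rho * z1 + math.sqrt(max(0.0, 1.0 - rho * rho)) * z2)
    return x, y


def contaminated_normal_sample(rng, n, rho, eps,
                               lam_x=100.0, lam_y=100.0, rho_prime=0.0):
    """Draw n i.i.d. pairs from the assumed mixture.

    With probability eps a pair comes from the impulsive component
    (std devs lam_x, lam_y, correlation rho_prime); otherwise it comes
    from the clean component (unit std devs, correlation rho).
    """
    sample = []
    for _ in range(n):
        if rng.random() < eps:
            sample.append(bivariate_normal(rng, rho_prime, lam_x, lam_y))
        else:
            sample.append(bivariate_normal(rng, rho, 1.0, 1.0))
    return sample
```

For instance, `rho_grid(-0.9, 0.1, 0.9)` produces the grid written as $\rho = -0.9(0.1)0.9$, and setting `eps=0.0` reduces the sampler to the purely normal case.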

A. Tables of $\Omega_1(\rho)$, $\Omega_2(\rho)$ and $\Omega_3(\rho)$ 

Table I contains the values of $\Omega_1(\rho)$ and $\Omega_2(\rho)$ in (20), the first statement of Theorem 1, for $\rho = 0(0.01)1$. In the upper panel are the values of $\Omega_1(\rho)$, whereas in the lower panel are the values of $\Omega_2(\rho)$. Due to the importance of $\mathbb{V}(r_S)$ both in theory and in practice, the table is made as dense and accurate as possible, with the increment $\Delta\rho$ being 0.01 and the precision being up to ten decimal places. Table II tabulates the values of $\Omega_3(\rho)$ in (28) of Theorem 2 for $\rho = 0(0.01)1$. For similar reasons, the increment $\Delta\rho$ and the precision are the same as those in Table I. The values of $\Omega_1(\rho)$, $\Omega_2(\rho)$ and $\Omega_3(\rho)$ for values of $\rho$ not included in Tables I and II can be easily obtained by interpolation. Given these tables, we can easily calculate the quantities that depend on $\Omega_1(\rho)$, $\Omega_2(\rho)$ and $\Omega_3(\rho)$, including $\mathbb{V}(r_S)$, $\mathbb{V}(\hat\rho_S)$, $\mathbb{V}(\hat\rho_M)$, $\mathrm{BIAS}(\hat\rho_S)$, $\mathrm{BIAS}(\hat\rho_M)$, $\mathrm{ARE}_S$, $\mathrm{ARE}_M$, and so forth. 

Fig. 1. Comparison of $\mathrm{BIAS}_\zeta$, $\zeta \in \{P, S, K, M\}$, for (a) $n = 10$ and (b) $n = 20$. Theoretical curves are plotted over $\rho = -1(0.01)1$ based on (48)-(51), respectively, whereas the simulation results are plotted over $\rho = -0.9(0.1)0.9$. 
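
The interpolation mentioned in Section VI-A can be sketched as follows. The snippet uses only a four-entry excerpt of the upper panel of Table I and plain linear interpolation; both the excerpt and the choice of linear (rather than higher-order) interpolation are assumptions made for illustration, and the names `OMEGA1_EXCERPT` and `interpolate` are hypothetical.

```python
# Excerpt of Table I (upper panel): Omega_1 at rho = 0.50, 0.51, 0.52, 0.53.
OMEGA1_EXCERPT = {
    0.50: 0.3029841008,
    0.51: 0.3110659830,
    0.52: 0.3193363285,
    0.53: 0.3277970726,
}


def interpolate(table, rho):
    """Linearly interpolate an even function tabulated on a grid of rho >= 0."""
    rho = abs(rho)  # the tabulated functions satisfy Omega(-rho) = Omega(rho)
    nodes = sorted(table)
    if rho < nodes[0] or rho > nodes[-1]:
        raise ValueError("rho outside the tabulated range")
    for left, right in zip(nodes, nodes[1:]):
        if left <= rho <= right:
            weight = (rho - left) / (right - left)
            return (1.0 - weight) * table[left] + weight * table[right]
    return table[nodes[0]]  # only reached when the table has a single node
```

With the full tables loaded in the same dictionary form, the same function would serve $\Omega_2$ and $\Omega_3$ as well. Since the tabulated functions are smooth and the grid spacing is 0.01, linear interpolation is likely adequate for most practical purposes, although it cannot reproduce the ten-digit precision of the tabulated nodes.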

B. Verification of $\mathrm{BIAS}_\zeta$ and $\mathbb{V}(\hat\rho_\zeta)$ in Small Samples 

Fig. 1 shows the bias effects $\mathrm{BIAS}_\zeta$ corresponding to the four estimators $\hat\rho_\zeta$, $\zeta \in \{P, S, K, M\}$, for $n = 10$ and $n = 20$, respectively. It is clearly observed that the magnitudes of $\mathrm{BIAS}_\zeta$ can be ordered as $\mathrm{BIAS}_M < \mathrm{BIAS}_P < \mathrm{BIAS}_K < \mathrm{BIAS}_S$. That is, the performance of $\hat\rho_S$ is much worse than those of the other three in terms of bias effect in small samples. Moreover, it is also observed that (49) and (51), with respect to $\mathrm{BIAS}_S$ and $\mathrm{BIAS}_M$, are more accurate than (48) and (50), with respect to $\mathrm{BIAS}_P$ and $\mathrm{BIAS}_K$. In other words, the former two formulae agree better than the latter two with the corresponding simulation results for a sample size $n$ as small as 10. Nevertheless, the deviations of (48) and (50) from the corresponding simulation results are less noticeable when the sample size $n$ is increased to 20. 

Fig. 2. Comparison of observed $\mathrm{MSE}_\zeta$, $\zeta \in \{P, S, K, M\}$, for (a) $n = 10$, (b) $n = 20$, (c) $n = 40$, and (d) $n = 60$ over $\rho = -1(0.1)1$, respectively. 
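
To make the quantities compared in this subsection concrete, the following sketch computes $r_S$ and $r_K$ for a tie-free sample and maps them to estimates of $\rho$. The calibrations $\hat\rho_S = 2\sin(\pi r_S/6)$ and $\hat\rho_K = \sin(\pi r_K/2)$ are the classical ones under bivariate normality and are assumed here to coincide with the estimators defined earlier in the paper; the modified estimator $\hat\rho_M$ is not reproduced. All function names are hypothetical.

```python
import math


def ranks(values):
    """Return 1-based ranks; assumes no ties (continuous data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_rho(xs, ys):
    """r_S = 1 - 6 * sum(d_i^2) / (n (n^2 - 1)) for tie-free data."""
    n = len(xs)
    d2 = sum((p - q) ** 2 for p, q in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def sign(v):
    return (v > 0) - (v < 0)


def kendall_tau(xs, ys):
    """r_K: average of sgn(x_i - x_j) * sgn(y_i - y_j) over all pairs i != j."""
    n = len(xs)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sign(xs[i] - xs[j]) * sign(ys[i] - ys[j])
    return 2.0 * total / (n * (n - 1))


def rho_from_spearman(r_s):
    """Assumed calibration of r_S under normality."""
    return 2.0 * math.sin(math.pi * r_s / 6.0)


def rho_from_kendall(r_k):
    """Assumed calibration of r_K under normality."""
    return math.sin(math.pi * r_k / 2.0)
```

Averaging `rho_from_spearman(spearman_rho(xs, ys)) - rho` over many simulated samples of size 10 or 20 would give a Monte Carlo estimate of $\mathrm{BIAS}_S$, and likewise for $\mathrm{BIAS}_K$.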

Table III lists, for each of the three sample sizes 10, 20 and 30, 1) the theoretical results (55)-(58) with respect to $\mathbb{V}(\hat\rho_\zeta)$ and 2) the corresponding observed variances from the Monte Carlo simulations. It can be seen that (56) and (58), with respect to $\mathbb{V}(\hat\rho_S)$ and $\mathbb{V}(\hat\rho_M)$, are accurate enough even though the sample size is as small as $n = 10$. On the other hand, unfortunately, the theoretical formulae (55) for $\mathbb{V}(\hat\rho_P)$ and (57) for $\mathbb{V}(\hat\rho_K)$ deviate substantially from the corresponding observed simulation results for the same sample size $n = 10$. However, it appears that these deviations become less noticeable for $n = 20$ and negligible for $n = 30$. Therefore, it would be safe to use (55)-(58) when approximating the variances of $\hat\rho_\zeta$ for $n > 20$ in practice. 

Fig. 3. Verification and comparison of $\mathrm{ARE}_S$ and $\mathrm{ARE}_K$ for $n = 1000$ over $\rho = 0(0.01)1$, for theoretical curves, and $\rho = 0(0.05)0.95$, for simulation results. The results (64) and (65) are used to plot the two theoretical curves for $\rho = 1$, respectively. 

TABLE III 

Variances $\mathbb{V}(\hat\rho_\zeta)$, $\zeta \in \{P, S, K, M\}$, for $n = 10, 20, 30$ 

Sample size $n = 10$ | Sample size $n = 20$ | Sample size $n = 30$ 
$\mathbb{V}(\hat\rho_P)$ $\mathbb{V}(\hat\rho_S)$ $\mathbb{V}(\hat\rho_K)$ $\mathbb{V}(\hat\rho_M)$ | $\mathbb{V}(\hat\rho_P)$ $\mathbb{V}(\hat\rho_S)$ $\mathbb{V}(\hat\rho_K)$ $\mathbb{V}(\hat\rho_M)$ | $\mathbb{V}(\hat\rho_P)$ $\mathbb{V}(\hat\rho_S)$ $\mathbb{V}(\hat\rho_K)$ $\mathbb{V}(\hat\rho_M)$ 
$\rho$ (55) Obs. (56) Obs. (57) Obs. (58) Obs. | (55) Obs. (56) Obs. (57) Obs. (58) Obs. | (55) Obs. (56) Obs. (57) Obs. (58) Obs. 
0.0 0.111 0.111 0.122 0.119 0.152 0.133 0.148 0.143 | 0.053 0.053 0.058 0.057 0.065 0.061 0.064 0.063 | 0.034 0.034 0.038 0.037 0.041 0.039 0.040 0.040 
0.1 0.109 0.109 0.120 0.117 0.150 0.131 0.145 0.141 | 0.052 0.052 0.057 0.056 0.064 0.060 0.063 0.062 | 0.034 0.034 0.037 0.037 0.040 0.039 0.040 0.039 
0.2 0.102 0.105 0.115 0.112 0.142 0.125 0.138 0.135 | 0.049 0.049 0.054 0.054 0.060 0.057 0.059 0.059 | 0.032 0.032 0.035 0.035 0.038 0.037 0.038 0.037 
0.3 0.092 0.096 0.106 0.105 0.129 0.117 0.127 0.125 | 0.044 0.045 0.049 0.049 0.055 0.052 0.054 0.054 | 0.029 0.029 0.032 0.032 0.034 0.034 0.034 0.034 
0.4 0.078 0.085 0.094 0.094 0.113 0.104 0.112 0.111 | 0.037 0.039 0.043 0.043 0.047 0.046 0.047 0.047 | 0.024 0.025 0.028 0.028 0.030 0.029 0.030 0.030 
0.5 0.063 0.071 0.080 0.081 0.093 0.089 0.093 0.094 | 0.030 0.032 0.036 0.036 0.039 0.038 0.039 0.039 | 0.019 0.020 0.023 0.023 0.024 0.024 0.024 0.025 
0.6 0.046 0.055 0.063 0.065 0.070 0.071 0.073 0.074 | 0.022 0.024 0.028 0.028 0.029 0.029 0.030 0.030 | 0.014 0.015 0.018 0.018 0.018 0.018 0.019 0.019 
0.7 0.029 0.038 0.046 0.048 0.048 0.052 0.051 0.053 | 0.014 0.016 0.019 0.020 0.019 0.020 0.020 0.021 | 0.009 0.010 0.012 0.012 0.012 0.012 0.012 0.013 
0.8 0.014 0.021 0.028 0.030 0.026 0.031 0.029 0.031 | 0.007 0.008 0.011 0.011 0.010 0.011 0.011 0.011 | 0.004 0.005 0.007 0.007 0.006 0.007 0.007 0.007 
0.9 0.004 0.007 0.012 0.013 0.009 0.012 0.011 0.012 | 0.002 0.002 0.004 0.004 0.003 0.004 0.004 0.004 | 0.001 0.001 0.002 0.002 0.002 0.002 0.002 0.002 

C. Comparison of MSE in Small Samples 

Contrary to $\mathrm{BIAS}_\zeta$ illustrated in Fig. 1, the magnitudes of the mean square errors 

$$\mathrm{MSE}_\zeta \triangleq \mathbb{E}\left[(\hat\rho_\zeta - \rho)^2\right], \quad \zeta \in \{P, S, K, M\}$$ 

cannot be ordered in a consistent manner. It appears in Fig. 2 that 1) $\mathrm{MSE}_M > \mathrm{MSE}_K > \mathrm{MSE}_S > \mathrm{MSE}_P$ when $|\rho|$ is around 0, 2) $\mathrm{MSE}_S > \mathrm{MSE}_K > \mathrm{MSE}_P$ when $|\rho|$ exceeds some threshold, which moves towards 0 as $n$ increases, and 3) the difference between $\mathrm{MSE}_K$ and $\mathrm{MSE}_S$ around $\rho = 0$ decreases steadily as $n$ increases. Furthermore, due to the asymptotic equivalence between $\hat\rho_S$ and $\hat\rho_M$, $\mathrm{MSE}_S$ and $\mathrm{MSE}_M$ become closer and 
closer as $n$ increases. 
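
The link between the MSE comparison here and the bias and variance results of the previous subsection is the standard decomposition of the mean square error, sketched below in the notation of this section. The derivation is textbook material and is included only as a reading aid.

```latex
\begin{align*}
\mathrm{MSE}_\zeta
  &= \mathbb{E}\big[(\hat\rho_\zeta - \rho)^2\big] \\
  &= \mathbb{E}\big[(\hat\rho_\zeta - \mathbb{E}\hat\rho_\zeta)^2\big]
     + 2\,\big(\mathbb{E}\hat\rho_\zeta - \rho\big)\,
       \underbrace{\mathbb{E}\big[\hat\rho_\zeta - \mathbb{E}\hat\rho_\zeta\big]}_{=\,0}
     + \big(\mathbb{E}\hat\rho_\zeta - \rho\big)^2 \\
  &= \mathbb{V}(\hat\rho_\zeta) + \mathrm{BIAS}_\zeta^2 .
\end{align*}
```

As an illustration, assume that the bias of each estimator vanishes at $\rho = 0$ by symmetry. The MSE then equals the variance, and the observed variances in Table III for $n = 10$ and $\rho = 0$ (0.111, 0.119, 0.133 and 0.143 for $P$, $S$, $K$ and $M$) reproduce the ordering $\mathrm{MSE}_M > \mathrm{MSE}_K > \mathrm{MSE}_S > \mathrm{MSE}_P$ reported for $|\rho|$ around 0.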

D. Verification and Comparison of $\mathrm{ARE}_S$ and $\mathrm{ARE}_K$ 

Fig. 3 verifies and compares the performance of $\hat\rho_S$ and $\hat\rho_K$ in terms of ARE. For the purpose of numerical verification, simulation results for $n = 1000$ are superimposed upon the corresponding theoretical curves. Due to the asymptotic equivalence between $\hat\rho_S$ and $\hat\rho_M$, the results with respect to $\mathrm{ARE}_M$ are not included in Fig. 3. It can be observed that 1) the simulations agree well with our theoretical findings in (62) and (63), respectively, 2) $\mathrm{ARE}_K$ lies consistently above $\mathrm{ARE}_S$, indicating the superiority of $\hat\rho_K$ over $\hat\rho_S$ for large samples, and 3) the performance of $\hat\rho_S$ deteriorates severely as $\rho$ approaches unity, although it performs similarly to $\hat\rho_K$ when $\rho$ is small. Note that the remarks on $\mathrm{ARE}_S$ also apply to $\mathrm{ARE}_M$ due to the equivalence between $\hat\rho_S$ and $\hat\rho_M$ when the sample size $n$ is large. 
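
A rough numerical feel for $\mathrm{ARE}_K$ can be obtained from classical large-sample results. The sketch below assumes 1) the classical asymptotic variance $n\,\mathbb{V}(r_K) \to 4\left[\frac{1}{9} - \frac{4}{\pi^2}\left(\sin^{-1}\frac{\rho}{2}\right)^2\right]$ under bivariate normality, 2) the calibration $\hat\rho_K = \sin(\pi r_K/2)$ handled by the delta method, and 3) $n\,\mathbb{V}(\hat\rho_P) \to (1-\rho^2)^2$. It is not claimed to be written in the same form as (63); the function name `are_kendall` is hypothetical.

```python
import math


def are_kendall(rho):
    """Large-sample efficiency of sin(pi * r_K / 2) relative to Pearson's r."""
    if abs(rho) >= 1.0:
        raise ValueError("use a limiting argument at |rho| = 1")
    # Assumed classical limit of n * V(r_K) under bivariate normality.
    n_var_tau = 4.0 * (1.0 / 9.0
                       - (4.0 / math.pi ** 2) * math.asin(rho / 2.0) ** 2)
    # Delta method: the derivative of sin(pi r / 2) at r = (2/pi) asin(rho)
    # is (pi / 2) * sqrt(1 - rho^2).
    n_var_rho_k = (math.pi ** 2 / 4.0) * (1.0 - rho ** 2) * n_var_tau
    # Assumed classical limit of n * V(r_P).
    n_var_rho_p = (1.0 - rho ** 2) ** 2
    return n_var_rho_p / n_var_rho_k
```

At $\rho = 0$ the expression reduces to $9/\pi^2$, the familiar efficiency of rank correlation under independence.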

E. Performance of $\hat\rho_\zeta$ in Contaminated Normal Model 

Fig. 4 purports to 1) verify the two statements concerning $\mathbb{E}(r_K)$ and $\mathbb{E}(r_S)$ in Theorem 3 under the contaminated Gaussian model (40), and 2) compare our formula (42) with the result (*) asserted elsewhere in the literature. Due to the lack of space, we only present the results for $\epsilon = 0.01$ and $\epsilon = 0.05$ under the sample size $n = 50$ here. For simplicity, the remaining parameters of the model (40) are set to $\sigma_x = \sigma_y = 1$, $\lambda_x = \lambda_y = 100$ and $\rho' = 0$ throughout. It is seen that the observed values of $\mathbb{E}(r_K)$ and $\mathbb{E}(r_S)$ agree well with the corresponding theoretical results (41) and (42) established in Theorem 3. On the other hand, however, the curves with respect to (*), especially in Fig. 4(b), deviate obviously from the corresponding observed values. 

Fig. 5 illustrates, in terms of MSE, the sensitivity of $\hat\rho_P$ as well as the robustness of $\hat\rho_S$, $\hat\rho_K$ and $\hat\rho_M$ to impulsive noise. It is shown in Fig. 5 that the MSE of $\hat\rho_P$ is dramatically larger than those of the other three estimators, irrespective of how small the fraction $\epsilon$ of the impulsive component is. On the other hand, it is seen that, despite some minor negative (positive) differences for $\rho$ around 0 ($\pm 1$), $\mathrm{MSE}_S$ and $\mathrm{MSE}_M$ behave similarly to $\mathrm{MSE}_K$ for $\epsilon = 0.01$. Nevertheless, $\mathrm{MSE}_S$ and $\mathrm{MSE}_M$ are much larger than $\mathrm{MSE}_K$ for $\epsilon = 0.05$ when $\rho$ falls in the neighborhood of $\pm 1$. Combining Fig. 5(a) and (b), it would be reasonable to rank their performance as $\hat\rho_K > \hat\rho_S \sim \hat\rho_M \gg \hat\rho_P$ in terms of MSE under the contaminated normal model (40), where the symbol $\sim$ stands 
for "is similar to". 
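
The sensitivity of $\hat\rho_P$ to impulsive noise can be reproduced on a small scale with the sketch below. It assumes the mixture form of model (40) used above (clean component with unit standard deviations and correlation $\rho$, impulsive component with standard deviations $\lambda = 100$ and $\rho' = 0$) and the calibration $\hat\rho_K = \sin(\pi r_K/2)$. The number of trials (200) is far smaller than in the experiments reported here, so the output is only indicative. All names are hypothetical.

```python
import math
import random


def draw_pair(rng, rho, scale):
    """One zero-mean normal pair with common std dev `scale` and correlation rho."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return scale * z1, scale * (rho * z1 + math.sqrt(max(0.0, 1.0 - rho * rho)) * z2)


def contaminated_sample(rng, n, rho, eps, lam=100.0, rho_prime=0.0):
    """n i.i.d. pairs; each is impulsive with probability eps."""
    return [draw_pair(rng, rho_prime, lam) if rng.random() < eps
            else draw_pair(rng, rho, 1.0)
            for _ in range(n)]


def pearson(sample):
    """Pearson's product moment correlation coefficient."""
    n = len(sample)
    mx = sum(x for x, _ in sample) / n
    my = sum(y for _, y in sample) / n
    sxy = sum((x - mx) * (y - my) for x, y in sample)
    sxx = sum((x - mx) ** 2 for x, _ in sample)
    syy = sum((y - my) ** 2 for _, y in sample)
    return sxy / math.sqrt(sxx * syy)


def rho_kendall(sample):
    """sin(pi/2 * r_K), with r_K = 2 * total / (n (n - 1))."""
    n = len(sample)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (sample[i][0] - sample[j][0]) * (sample[i][1] - sample[j][1])
            total += (prod > 0) - (prod < 0)
    return math.sin(math.pi * total / (n * (n - 1)))


def mse(estimator, rho, eps, n=50, trials=200, seed=0):
    """Monte Carlo estimate of E[(estimate - rho)^2]."""
    rng = random.Random(seed)
    errors = [(estimator(contaminated_sample(rng, n, rho, eps)) - rho) ** 2
              for _ in range(trials)]
    return sum(errors) / trials
```

Comparing `mse(pearson, 0.8, 0.05)` with `mse(rho_kendall, 0.8, 0.05)` should show the same qualitative gap as Fig. 5(b): a single impulsive pair tends to dominate the sums in `pearson`, whereas it changes only a small fraction of the signs counted in `rho_kendall`.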
Fig. 4. Verification of Theorem 3 for (a) $\epsilon = 0.01$ and (b) $\epsilon = 0.05$ under the sample size $n = 50$ over $\rho = -1(0.1)1$, for simulations, and $\rho = -1(0.01)1$, for theoretical formulae (41) and (42). The remaining parameters of the model (40) are set to $\sigma_x = \sigma_y = 1$, $\lambda_x = \lambda_y = 100$ and $\rho' = 0$, respectively. The formula (*) concerning $\mathbb{E}(r_S)$ developed elsewhere is also included for comparison. 



Fig. 5. Performance comparison in terms of $\mathrm{MSE}_\zeta$, $\zeta \in \{S, P, K, M\}$, over $\rho = -1(0.1)1$ in the contaminated normal model (40) for (a) $\epsilon = 0.01$ and (b) $\epsilon = 0.05$ under the sample size $n = 50$. The remaining parameters of the model are set to $\sigma_x = \sigma_y = 1$, $\lambda_x = \lambda_y = 100$ and $\rho' = 0$, respectively. 
VII. Concluding Remarks 

In this paper we have investigated systematically the properties of Spearman's rho and Kendall's tau for samples drawn from bivariate normal and contaminated normal populations. Theoretical derivations along with Monte Carlo simulations reveal that, contrary to the opinion of equivalence between SR and KT in some literature, e.g. [23], they behave quite differently in terms of mathematical tractability, bias effect, mean square error and asymptotic relative efficiency in the normal cases, and in terms of robustness in the contaminated normal model. 

As shown in Theorem 1, SR is mathematically less tractable than KT, in the sense of the intractable terms $\Omega_1(\rho)$ and $\Omega_2(\rho)$ in the formula of its variance (20), in contrast with the closed form expression of $\mathbb{V}(r_K)$ in (19). However, this mathematical inconvenience is, to some extent, offset by Table I provided in this work, especially from the viewpoint of numerical accuracy. Moreover, as demonstrated in Fig. 1 and Table III, the asymptotic formulae (50) and (57) for $\mathrm{BIAS}_K$ and $\mathbb{V}(\hat\rho_K)$ converge more slowly, and are therefore less accurate in small samples, than those for $\mathrm{BIAS}_S$ and $\mathbb{V}(\hat\rho_S)$, due to the high nonlinearity of the underlying calibration. As a consequence, we do not attach too much importance to such mathematical advantage of KT over SR. 

Now let us turn back to the question raised at the very beginning of this paper: which one, SR or KT, should we use in practice when PPMCC is inapplicable? The answer to this question depends on the requirements of the task at hand. Specifically, 

1) If unbiasedness is the top priority, then neither $\hat\rho_S$ nor $\hat\rho_K$ should be resorted to. The modified version $\hat\rho_M$, which employs both SR and KT, is definitely the best choice (cf. Fig. 1). 

2) On the other hand, if minimal MSE is the critical feature and the sample size $n$ is small, then $\hat\rho_S$ ($\hat\rho_K$) should be employed when the population correlation $\rho$ is weak (strong) (cf. Fig. 2). 

3) Since $\hat\rho_K$ outperforms $\hat\rho_S$ asymptotically in terms of ARE, $\hat\rho_K$ is the suitable statistic in large-sample cases (cf. Fig. 3). 

4) If there is impulsive noise in the data, then it would be better to employ $\hat\rho_K$, in terms of MSE, although there is some minor advantage of $\hat\rho_S$ when $\rho$ is in the neighborhood of 0 (cf. Fig. 5). 

5) Moreover, in terms of time complexity, $\hat\rho_S$ appears to be superior to $\hat\rho_K$: the computational load of the former is $O(n \log n)$, whereas that of the latter is $O(n^2)$ [35]. 

Possessing the desirable properties summarized in Section II, Spearman's rho and Kendall's tau have found wide applications in the literature other than information theory. With the new insights uncovered in this paper, these two rank based coefficients can play complementary roles under the circumstances where Pearson's product moment correlation coefficient is no longer effective. 
TABLE IV 

Quantities for evaluation of $\mathbb{E}(\mathcal{S}^2)$ in Theorem 1 



Coit. Matrix 


No. of Terms* 




Representative Term 






Correlation Coeffcients 




p°(z 1 ,z 2 ,z 3 ,Z4Y 


W(0) 


W(l) 








z 2 


Zz 


^4 




£>13 


£14 


£23 


Q24 


P34 


R a 


„[6] 


A 1 A 2 


Y1-Y3 


X4—X5 


J 4 1 


2' 














2P 


1 1 S2 1 $2 
16" r 4ir ~ r 47r 2 ' 






Hi 

1 1 6 




Xi X2 


Y\— Y 2 


X3—X4 


J 3 J 5 


p 














•2.P 


1 1 Si , 52 1 S1S2 
16 T 87i T 8ir" T 17 r 
5 ,S 2 ,W c {p) 
48 T 27r" r 16 






Rr 




1 2 


Y1-Y3 


Xi— X4 


Y1—Y5 


-p 

iP 


2 


y 


y 


2 


-P 

2' 


1 

!) 


1 

5 


R i 




Al A2 


Yi-Yz 


X2— X4 


Vo—Vr 

*2 *5 


in 

2< J 


1 

~ 2 


-y 








2^ 


1 ,S 2 ,W d (p) 
24" r 87r~ 16 


Q 


1 

IS 


1 ' e 




Xi A*2 




X4— X2 


Yi — Yk 
* 4 J 5 


2'' 


1 

2 











2^ 


1 , S 2 , W c Tp) 

12 T 4i T 16 






L> , 
h t 


Zft L J 




yi-y 3 


X4—X3 


14 — 15 


1 „ 

a^ 








1 

y 





1 ,, 
2^ 


1 | 3S 2 . W f (p) 
16 ' 8tt ' 16 


n 
u 


2 

15 


A> 


4n' 4 l 


Xl — X 2 


Y-1-Y2 


X1—X3 


11 1 4 


P 


1 

2 




1 


1 
2 


1 ,. 


5 , Si , 3S 2 , W g (p) 
48" , "87^" , " 8tt "T , 16 


1 

9 


1 
3 


■* '71 


4nW 




'1"'2 


A3-A4 


y— y 


n 
r 


n 
u 


1 ,. 


n 
u 


1 

2 


2' 


1 1 5i , W h (p) 
24" r 87r" r „ 16 





1 
3 




4„[4] 


A"i — X2 




A3— A4 


y 3 -y2 


\P 


n 
u 


1 ,, 


1 „ 

2.P 


1 
2 


y 


J_+S2 








n W 


Xi-X 




A3-A4 


y 3 -Y" 4 


n 

r 


U 


n 
U 


u 


U 


n 
r 


1 , Si. I sl 
16^4^ 








2nM 


Xt —X2 


1 


At A2 


y-y 4 


hp 

1 1 


1 


1 ,, 

2^ 


1 


1 

2 


y 


1 1 5 2 

6 27T , , 






R\ 


4nM 


X1—X2 


Yi-Y 3 


X4— X\ 


y4-y 3 


y 


_ J_ 

2 





-y 


1 

2 


y 


1 , S 2 , Wi(p) 

ID »7T lo, 4 


_ 1 
9 





Rm 


«W 


X1—X2 


Y1-Y3 


X4 — X2 


y 4 -y 3 


y 


1 

2 








1 

2 


y 


5 , S 2 1 W m (p) 
48" r 47r" r 16 




— 


Rn 


2nM 


X1—X2 


Y1-Y3 


X2—X4 


y 2 -Yi 


y 


1 

2 


-p 





1 

2 


y 


1 Si ,S 2 , W„(p) 
48 Sir x 4tt x 16 
1 1 S 2 | l^ctp) 
16" r 27r" r 16 


1 

!) 





Rq 


«W 


X1—X2 


F1-Y3 


X4 — X3 


y 4 -y 2 


y 





y 


y 





y 





1 

3 


Rp 


2nM 


X1—X2 


n-Ya 


X2—X3 


y 2 -y 4 


p 


1 

2 


-y 


-y 


1 
2 


y 


1 1 Si 02 1 v ^plPJ 
48 ' 8?r 8tt ' 16 






Rq 


W 4 1 


X1—X2 


Yi-Ya 


X3 — X2 


y 3 -y 4 


p 


1 

2 





y 





y 


1 , Sj. , S 2 , W q lp) 
12 8tt 4ir 16 2 






R r 


3nf 3 l 


X1—X2 




X1-X3 


y-y 3 


p 


1 

2 




y 


1 

2 


p 


1 1 Si 1 S2 1 s x S 2 
9 T 4i T 47r T 4tt' 2 4tt' j 






Rs 


„[3] 


X1—X2 


Ki-y 3 


Xi — X2 


y-y 3 


y 


1 


|p 


y 


1 


y 


1 , s 2 

4 2 2 






Rt 




X1—X2 


Ki-y 3 


X3— Xi 


y 3 -y 2 


y 


1 

2 


y 


-p 


1 

2 


y 


1 Si 1 3S2 s x s 2 
36 Stt" 1 " 821- T 8?r 2 






Ru 


4„[3] 


X1—X2 


yi-y 2 


Xi— X2 


y-y 3 


p 


1 


y 


p 


1 

2 


y 


1 1 Si , S2 

6 f 4tt 4 3 2 






Rv 


4nl 3 l 


X\~X2 


yi-y 2 


X3— Xi 


y 3 -y2 


p 


1 

2 


y 


-y 


1 
2 


y 


1 1 Si 1 S2 r Sj S 2 

18 8tt ' 8tt ' 8tt 2 2 8it? 






Rw 


2nl 3 l 


X1—X2 


Y1-Y2 


X3— Xi 


y 3 -yi 


p 


1 

2 


-y 


-y 


1 
2 


p 


1 1 Si S 2 1 S x S 2 
36" r 47r 4tt T 4tt 2 4tt 2 










Xl—X2 


Y 1 -Y 2 


Xi— X2 


y-Ya 


p 


1 


p 


p 


1 


p 


1 , Si 
4 t 2i 







* The symbol $n^{[\kappa]}$ is a compact notation of $n(n-1)\cdots(n-\kappa+1)$, with $\kappa$ being a positive integer. 

† The orthant probability $P_4^0(Z_1, Z_2, Z_3, Z_4) \triangleq \mathbb{E}\{H(Z_1)H(Z_2)H(Z_3)H(Z_4)\}$. Notations $S_1 \triangleq \sin^{-1}\rho$ and $S_2 \triangleq \sin^{-1}\frac{1}{2}\rho$ are used for brevity. 



Appendix A 
Proof of Theorem 1 

Proof: Using the technique developed by Moran for finding $\mathbb{E}(r_S)$, it follows that the ranks can be expressed as 

$$P_i = \sum_{j=1}^{n} H(X_i - X_j) + 1 \tag{66}$$ 

$$Q_i = \sum_{k=1}^{n} H(Y_i - Y_k) + 1 \tag{67}$$ 

where $H(\cdot)$ is the unit step function defined earlier, with $H(0) = 0$. Substituting (66) and (67) into the definition of $r_S$ yields 

$$r_S = \frac{12\left[\mathcal{S} - \frac{1}{4}n(n-1)^2\right]}{n(n^2-1)} \tag{68}$$ 

where 

$$\mathcal{S} = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} H(X_i - X_j)H(Y_i - Y_k) = \sum_{i \ne j} H(X_i - X_j)H(Y_i - Y_j) + \sum_{i \ne j \ne k} H(X_i - X_j)H(Y_i - Y_k). \tag{69}$$ 

Then 

$$\mathbb{V}(r_S) = \frac{144}{n^2(n^2-1)^2}\underbrace{\left[\mathbb{E}(\mathcal{S}^2) - \mathbb{E}^2(\mathcal{S})\right]}_{\mathbb{V}(\mathcal{S})}. \tag{70}$$ 

Taking the expectation of both sides of (69) with the assistance of Lemma 2, it follows readily that 

$$\mathbb{E}(\mathcal{S}) = n^{[2]}\left(\frac{1}{4} + \frac{S_1}{2\pi}\right) + n^{[3]}\left(\frac{1}{4} + \frac{S_2}{2\pi}\right) \tag{71}$$ 

where $S_1 \triangleq \sin^{-1}\rho$, $S_2 \triangleq \sin^{-1}\frac{1}{2}\rho$, and $n^{[\kappa]} \triangleq n(n-1)\cdots(n-\kappa+1)$, with $\kappa$ being a positive integer. Now the variance of $r_S$ depends on the evaluation of $\mathbb{E}(\mathcal{S}^2)$, which is a weighted summation of 24 quadrivariate normal orthant probabilities $P_4^0(R_\xi) \triangleq \mathbb{E}\{H(Z_1)H(Z_2)H(Z_3)H(Z_4)\}$ corresponding to the correlation matrices $R_\xi$ listed in Table IV [29]. Collecting the terms of $P_4^0(R_\xi)$ in Table IV, subtracting the square of the right side of (71) and substituting the resultant into (70) along with some simplifications, we obtain the expression of (20) with 

$$\Omega_1(\rho) = W_c + 4W_d + 2W_e + 2W_f \tag{72}$$ 

$$\Omega_2(\rho) = 4(W_g + W_h + W_l + W_q) + 2(W_n + W_p) + W_m + W_o. \tag{73}$$ 

An application of the relationship (11) to Appendix 2 of [29] yields 

$$W_e = 2W_d, \quad W_g = W_p, \quad W_h = W_q, \quad \text{and} \quad W_m = 2W_l + \frac{1}{3}. \tag{74}$$ 

Substituting (74) into (72) and (73) yields (21) and (22), respectively. Hence the first theorem statement (20) follows. 
Ignoring the $o(n^{-1})$ terms in (20) yields the second statement (23), thus completing the proof. ■ 
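
The step from (66)-(67) to (68) can be checked numerically. The sketch below assumes that $r_S$ is defined by the classical tie-free formula $1 - 6\sum_i (P_i - Q_i)^2 / [n(n^2-1)]$ and that (68) reads $r_S = 12[\mathcal{S} - \frac{1}{4}n(n-1)^2]/[n(n^2-1)]$, with $\mathcal{S}$ the triple sum in (69). Function names are hypothetical.

```python
def heaviside(v):
    """Unit step with H(0) = 0, as used in the appendices."""
    return 1 if v > 0 else 0


def ranks_via_heaviside(values):
    """Ranks as in (66)-(67): one plus the number of strictly smaller entries."""
    return [sum(heaviside(v - w) for w in values) + 1 for v in values]


def s_statistic(xs, ys):
    """Triple sum of (69), factorised as sum_i a_i * b_i.

    a_i (b_i) counts the entries of xs (ys) strictly smaller than x_i (y_i).
    """
    a = [sum(heaviside(x - w) for w in xs) for x in xs]
    b = [sum(heaviside(y - w) for w in ys) for y in ys]
    return sum(ai * bi for ai, bi in zip(a, b))


def spearman_from_s(xs, ys):
    """r_S computed from S via the assumed form of (68)."""
    n = len(xs)
    return 12.0 * (s_statistic(xs, ys) - n * (n - 1) ** 2 / 4.0) / (n * (n * n - 1))


def spearman_classical(xs, ys):
    """r_S from the classical rank-difference formula (assumed definition)."""
    n = len(xs)
    p, q = ranks_via_heaviside(xs), ranks_via_heaviside(ys)
    d2 = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))
```

The two routes agree because $\sum_i P_i Q_i = \mathcal{S} + n^2$ for tie-free data, which turns the classical formula into a linear function of $\mathcal{S}$.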

Appendix B 

Derivations of $\Omega_1(\rho)$, $\Omega_2(\rho)$ and $\Omega_3(\rho)$ for $\rho = 0, 1$ 

Proof: From (21), (22) and (30), it suffices to evaluate $W_{\xi'}$, $\xi' \in \{c, d, f, g, h, l, n, o\}$, for $\rho = 0, 1$; and with (34), it suffices to evaluate $P_4^0(R_{\xi'})$ for $\rho = 0, 1$. It follows readily from Appendix 2 of [29] that for $\rho = 0$, $P_4^0(R_c) = P_4^0(R_g) = 1/9$, $P_4^0(R_d) = P_4^0(R_h) = 1/24$, $P_4^0(R_f) = P_4^0(R_o) = 1/16$, $P_4^0(R_l) = 1/18$, $P_4^0(R_n) = 1/36$. Then, with the help of (34), we have the values $W_{\xi'}(0)$ as listed in the $W(0)$-column of Table IV. Using these $W_{\xi'}(0)$ values with the relationships (21), (22) and (30) yields $\Omega_1(0) = 1/9$, $\Omega_2(0) = 5/9$ and $\Omega_3(0) = 1/18$, respectively. 

When $\rho$ approaches unity, it is rather tricky to evaluate the values $W_{\xi'}(1)$. Substituting $\rho = 1$ directly into the integrals in (13) or the integrals in Appendix 2 of [29] will not lead to any tractable argument. We have to investigate case by case. From Table IV it is seen that the off-diagonal elements of $R_c$ are all $1/2$ when $\rho = 1$. Then we have $P_4^0(R_c)|_{\rho=1} = 1/5$ by (12), and hence $W_c(1) = 1/5$ by (34). From [47] it follows that $P_4^0(R_f)|_{\rho=1} = 2/15$ and $P_4^0(R_m)|_{\rho=1} = 1/6$. Then we have, by (34), $W_f(1) = 2/15$ and $W_m(1) = 1/3$, the latter yielding $W_l(1) = 0$ from the identity $W_m = 2W_l + 1/3$ in (74). Substituting $R_f|_{\rho=1}$ into (12) and exchanging $z_1$ and $z_2$ gives $W_f(1) = W_e(1)$, which implies that $W_d(1) = 1/15$ by the identity $W_e = 2W_d$ in (74). Similarly we also have $W_o(1) = W_m(1) = 1/3$ upon substitution of $R_m|_{\rho=1}$ into (12) and exchange of $z_3$ and $z_4$. It is easy to verify that $P_4^0(R_n)$ vanishes as $\rho \to 1$, since in this case $Z_1 = -Z_4$ and $H(Z_1)H(Z_2)H(Z_3)H(Z_4) = 0$ by the definition of $H(\cdot)$. Then $W_n(1) = 0$ by applying the relationship (34) once more. When $\rho$ approaches unity, it follows that $P_4^0(R_g)$ and $P_4^0(R_h)$ degenerate to two trivariate normal orthant probabilities that have closed form expressions. Specifically, it follows that $P_4^0(R_g)|_{\rho \to 1} = 1/4$ and $P_4^0(R_h)|_{\rho \to 1} = 1/8$, yielding $W_g(1) = 1/3$ and $W_h(1) = 1/3$, respectively. Having all the values of $W_{\xi'}(1)$, as listed in the $W(1)$-column of Table IV, and the three relationships (21), (22) and (30), we obtain $\Omega_1(1) = 1$, $\Omega_2(1) = 16/3$ and $\Omega_3(1) = 1/2$, respectively, and 
the evaluations are complete. ■ 
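
The arithmetic behind $\Omega_1(1) = 1$ and $\Omega_2(1) = 16/3$ can be replayed with exact fractions. The sketch assumes that (72) and (73) read $\Omega_1 = W_c + 4W_d + 2W_e + 2W_f$ and $\Omega_2 = 4(W_g + W_h + W_l + W_q) + 2(W_n + W_p) + W_m + W_o$, and that the identities (74) are $W_e = 2W_d$, $W_p = W_g$, $W_q = W_h$ and $W_m = 2W_l + 1/3$. These forms are consistent with the boundary values stated in Table I, but the dictionary layout and function names are hypothetical.

```python
from fractions import Fraction as F

# Values of W at rho = 1 quoted in Appendix B.
W1 = {
    "c": F(1, 5), "d": F(1, 15), "f": F(2, 15),
    "g": F(1, 3), "h": F(1, 3), "l": F(0),
    "m": F(1, 3), "n": F(0), "o": F(1, 3),
}
# Remaining entries follow from the assumed identities (74).
W1["e"] = 2 * W1["d"]
W1["p"] = W1["g"]
W1["q"] = W1["h"]


def omega1(w):
    """Omega_1 in the assumed form of (72)."""
    return w["c"] + 4 * w["d"] + 2 * w["e"] + 2 * w["f"]


def omega2(w):
    """Omega_2 in the assumed form of (73)."""
    return (4 * (w["g"] + w["h"] + w["l"] + w["q"])
            + 2 * (w["n"] + w["p"]) + w["m"] + w["o"])
```

Evaluating `omega1(W1)` and `omega2(W1)` gives `Fraction(1, 1)` and `Fraction(16, 3)`, matching notes (b) and (c) of Table I.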



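Appendix C 
Proof of Theorem 2 

Proof: Let $\mathcal{S}$ be the same as in (69) and $\mathcal{T}$ be the numerator of (3). Define 

$$\mathcal{I} \triangleq \sum_{i \ne j} H(X_i - X_j)H(Y_i - Y_j) \tag{75}$$ 

$$\mathcal{J} \triangleq \sum_{i \ne j \ne k} H(X_i - X_j)H(Y_i - Y_k) \tag{76}$$ 

$$\mathcal{K} \triangleq \sum_{i \ne j} H(X_i - X_j) \tag{77}$$ 

$$\mathcal{L} \triangleq \sum_{i \ne j} H(Y_i - Y_j). \tag{78}$$ 

Then, we have, from the definition of $\mathcal{T}$, (68) and (69), along with the relationship $\mathrm{sgn}(\cdot) = 2H(\cdot) - 1$, 

$$\mathcal{S} = \mathcal{I} + \mathcal{J} \tag{79}$$ 

$$\mathcal{T} = 4\mathcal{I} - 2\mathcal{K} - 2\mathcal{L} + n^{[2]} \tag{80}$$ 

and hence 

$$\mathbb{C}(r_S, r_K) = \frac{12}{n^2(n-1)(n^2-1)}\,\mathbb{C}(\mathcal{S}, \mathcal{T}) = \frac{12}{n^2(n-1)(n^2-1)}\left[\mathbb{E}(\mathcal{S}\mathcal{T}) - \mathbb{E}(\mathcal{S})\mathbb{E}(\mathcal{T})\right]. \tag{81}$$ 

From (8) and Sheppard's theorem, it follows that 

$$\mathbb{E}(\mathcal{I}) = n^{[2]}\left(\frac{1}{4} + \frac{S_1}{2\pi}\right) \quad \text{and} \quad \mathbb{E}(\mathcal{K}) = \mathbb{E}(\mathcal{L}) = \frac{n^{[2]}}{2}.$$ 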
TABLE V 

Quantities for evaluation of $\mathbb{E}(\mathcal{S}\mathcal{T})$ in Theorem 2 



Representative Term . . _ . _ , 

Corr. Matrix No. of Terms* P°{Zi, Z 2 , Z 3 , Z 4 )1 Pg(Z 2 , Z 3 , Z 4 )l P$(Z 4 ,Z 3 ,Z 4 y 

Zi z 2 z 3 z 4 



Rb 


n [5] 


A'i 


-A 2 


Yi 


-Y 2 


X 3 


-x 4 


Y 3 


-Y B 


1 _i_ S\ , S-2 
16 K7T 871 


| S1S2 
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8 47T 




1 1 s 2 
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Rg 


»w 
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Vi 


-Y 2 


Xy 


-x 3 


Yy 


-Y 4 


5 1 Si 1 3S2 
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^ 16 




1 1 s 2 

6 ~ 2tt 




1 1 s 2 

6 ~ 2tt 


Rh 


nW 


A'i 


-A 2 


Vi 


-Y 2 


x 3 


-x 4 


Y 3 


-Yi 


1 ,s 1 + 

24 T 8tt t 


16 




1 1S2 
12" r 47r 




1 

8 


Rh 


nW 


Xi 


-A 2 


>i 


-Y 2 


x 3 




Y 3 


-Y 4 


1 ,s 1 + 

24 T 8tt T 


16 




1 , s 2 

12" r 47r 




1 

8 


Hp 


nW 


Ai 


-A 2 


Vi 


-Y 2 


X 2 


-x 3 


Y 2 


-Y 4 


1 , Si S 2 
48 ~ 8tt 8tt 


, W v {p) 
^ 16 




1 

12 




1 

12 


Rq 


nW 


Ai 


-A 2 


Yi 


-Y 2 


x 3 


-Xi 


Y 3 


-Yi 


1 1 Si ,52 
12 """Sir" 1 " 4vr 


,W,(p) 
"f" 16 




1 , 5 2 
8~ r 2-7r 




1 , s 2 

6~ r 4-7r 


Rq 


nW 


Ai 


-A 2 


Yi 


-Y 2 


x 3 


-x 4 


Y 3 


-Yi 


1 1 Si , 52 
12 ~ 8tt ~ 4tt 


,W q (p) 
~*~ 16 




1 , s 2 

8~ r 2-7r 




1 , s 2 

6~ r 4-7r 


Rv. 


t»M 


Ai 


-X 2 


Yi 


-Yi 


Ai 


-Xi 


Yy 


-Yi 


6 ' 4?r n 


.S2 

47T 


1 

6" 


1 Si 1 52 
r 4tt ~ 4tt 




1 , s 2 

4" r 2-7r 


Rv. 


r»M 


Ai 


-X 2 


Yi 


-Y 2 


Xy 


-A 3 


Yy 


-Yi 




_S2 
4-7T 


1 

6" 


, 5i , 5 2 
r 4tt 4tt 




1 , s 2 

4" r 2-7r 


R v 


»M 


Ai 


-x 2 


Yi 


-Y 2 


x 3 


-X 1 


Y 3 


-Yi 


1 1 Si I 52 1 
18 ~ 8tt ' Sir ' 


Si s 2 

Stv 7 Stv' 2 




1 

(i 




1 , s 2 

12 t 2tt 


R v 




Ai 


-x 2 


Vi 


-Yi 


x 3 


-Xi 


Y 3 


-Yi 


1 1 Si 1 52 1 
18 ~ r 8tt 8ir 


sf s 2 

8ir 2 8ir 2 




1 

6 




1 , s 2 

12 t 2tt 


Ri 




A'l 


-x 2 


Vi 


-Yi 


x 2 


-Ai 


Yi 


-Y 3 







1 

12 


Si 1 S 2 

47T 47T 







R2 


nM 


Ai 


-x 2 


Vi 


-Yi 


x 2 


-x 3 


Yi 


-Yi 












1 

12 


Si 1 s 2 

47T 47T 




Rj 


r»W 


Xy 


-A 2 


Yy 


-Yi 


x 3 


-x 4 


Y 3 


-Y 4 




e2 
a l 

47T 2 




l.Si 
8~ r 47r 




1 , Si 

8^~47T 


Rr 




Ai 


-A 2 


Yi 


-Y 2 


Xy 


-A 3 


Yy 


-Y 3 


1 1 Si 1 52 1 
9 4tt 4tt 


sf _ s 2 

47T 2 47T 2 


1 

6" 


1 Si 1 5 2 
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1 

6" 


1 Si _|_ S2 
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R r 




Ai 


-A 2 


Yi 


-Yi 


x 3 


-Xi 


Y 3 


-Yi 


1 1 Si 1 52 1 
9 4ir 4vr 


Si 2 _ s 2 


1 

6" 
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r 4tt 4tt 


1 

6" 


■ Si _|_ S2 
r 4tt " r 4tt 


Rw 




Ai 


-A 2 


Vi 


-Yi 


x 3 


-x 1 


Y 3 


-Yi 


1 1 Si 52 1 
36 """471 4tt 


s? _ s 2 

47T 2 47T 2 


1 

12 


1 Si 5 2 

4-7T 47T 


1 

12 


1 Si 32 
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Rw 




Ai 


-A 2 


Yy 


-Yi 


x 2 


-x 3 


Y 2 


-Y 3 


1 1 Si 52 1 
36 """471 4tt 


s 2 _ s 2 

47T 2 4 77 ' 2 


1 

12 


1 Si 5 2 

4-7T 47T 


1 

12 


1 Si S2 

47T 47T 


Rx 


f»P] 


Ai 


-A 2 


Yy 


-Yi 


Xy 


-Xi 


Yy 


-Yi 


1 , Si 
4~ r 2tt 




l.Si 
4" r 2-7r 




1 , Si 
4" 1 " 2tt 



* The symbol $n^{[\kappa]}$ is a compact notation of $n(n-1)\cdots(n-\kappa+1)$, with $\kappa$ being a positive integer. 

† $P_4^0(Z_1, Z_2, Z_3, Z_4) \triangleq \mathbb{E}\{H(Z_1)H(Z_2)H(Z_3)H(Z_4)\}$, $P_3^0(Z_2, Z_3, Z_4) \triangleq \mathbb{E}\{H(Z_2)H(Z_3)H(Z_4)\}$, and $P_3^0(Z_1, Z_3, Z_4) \triangleq \mathbb{E}\{H(Z_1)H(Z_3)H(Z_4)\}$. Notations $S_1 \triangleq \sin^{-1}\rho$ and $S_2 \triangleq \sin^{-1}\frac{1}{2}\rho$ are used for brevity. 



Substituting these expectation terms into (80) gives 

$$\mathbb{E}(\mathcal{T}) = 4n^{[2]}\left(\frac{1}{4} + \frac{S_1}{2\pi}\right) - n^{[2]} = \frac{2n^{[2]}}{\pi}S_1. \tag{82}$$ 

Recall that we have obtained $\mathbb{E}(\mathcal{S})$ in (71). Now the only difficulty lies in the evaluation of $\mathbb{E}(\mathcal{S}\mathcal{T})$ in (81). Multiplying (79) and (80), expanding and taking expectations term by term, we have 

$$\mathbb{E}(\mathcal{S}\mathcal{T}) = 4\mathbb{E}(\mathcal{I}\mathcal{J}) - 2\mathbb{E}(\mathcal{K}\mathcal{J}) - 2\mathbb{E}(\mathcal{L}\mathcal{J}) + 4\mathbb{E}(\mathcal{I}^2) - 2\mathbb{E}(\mathcal{K}\mathcal{I}) - 2\mathbb{E}(\mathcal{L}\mathcal{I}) + n^{[2]}\mathbb{E}(\mathcal{S}). \tag{83}$$ 

Now, resorting to Table V, we are ready to evaluate the first six terms in (83). From (75) and (76), it follows that $\mathbb{E}(\mathcal{I}\mathcal{J})$ is a summation of $P_4^0$ terms of the form 

$$\mathbb{E}\{H(\underbrace{X_i - X_j}_{Z_1})H(\underbrace{Y_i - Y_j}_{Z_2})H(\underbrace{X_k - X_l}_{Z_3})H(\underbrace{Y_k - Y_m}_{Z_4})\}. \tag{84}$$ 

Since, by definition, $H(0) = 0$, the term (84) vanishes for $i = j$ or $k = l$ or $k = m$. Then there are $n^2(n-1)^2(n-2)$ nontrivial (84)-like terms left to be evaluated. It follows that the domain of the quintuple $(i, j, k, l, m)$ can be partitioned into thirteen disjoint and exhaustive subsets whose representative terms, $Z_1$, $Z_2$, $Z_3$, $Z_4$, are listed in the upper panel of Table V. Summing up the corresponding $P_4^0$-terms in Table V leads directly to $\mathbb{E}(\mathcal{I}\mathcal{J})$. In a similar manner we can obtain $\mathbb{E}(\mathcal{K}\mathcal{J})$ and $\mathbb{E}(\mathcal{L}\mathcal{J})$. With the assistance of the lower panel of Table V, we also have the expressions of $\mathbb{E}(\mathcal{I}^2)$, $\mathbb{E}(\mathcal{K}\mathcal{I})$ and $\mathbb{E}(\mathcal{L}\mathcal{I})$. Substituting these results and (71) into (83), subtracting the product of (71) and (82) and substituting the resultant back into (81), we find that $\mathbb{C}(r_S, r_K)$ is of the 
form (28) with 

$$\Omega_3(\rho) = \frac{1}{4}W_g + \frac{1}{4}W_p + \frac{1}{2}W_h + \frac{1}{2}W_q$$ 

which simplifies to (30) by applying the identities in (74). The theorem then follows. ■ 
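
The identity behind (80) is easy to verify on data. The sketch below assumes that $\mathcal{T}$ is the sum of $\mathrm{sgn}(X_i - X_j)\,\mathrm{sgn}(Y_i - Y_j)$ over all ordered pairs $i \ne j$ (the numerator of $r_K$), and that $\mathcal{I}$, $\mathcal{K}$ and $\mathcal{L}$ are the pairwise sums of $H(X_i - X_j)H(Y_i - Y_j)$, $H(X_i - X_j)$ and $H(Y_i - Y_j)$, respectively. Function names are hypothetical.

```python
def heaviside(v):
    """Unit step with H(0) = 0."""
    return 1 if v > 0 else 0


def sign(v):
    return (v > 0) - (v < 0)


def t_from_signs(xs, ys):
    """Numerator of r_K as a sum of sign products over ordered pairs i != j."""
    n = len(xs)
    return sum(sign(xs[i] - xs[j]) * sign(ys[i] - ys[j])
               for i in range(n) for j in range(n) if i != j)


def t_from_heaviside(xs, ys):
    """Same quantity via T = 4 I - 2 K - 2 L + n(n - 1) (tie-free data)."""
    n = len(xs)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    big_i = sum(heaviside(xs[i] - xs[j]) * heaviside(ys[i] - ys[j]) for i, j in pairs)
    big_k = sum(heaviside(xs[i] - xs[j]) for i, j in pairs)
    big_l = sum(heaviside(ys[i] - ys[j]) for i, j in pairs)
    return 4 * big_i - 2 * big_k - 2 * big_l + n * (n - 1)
```

For tie-free data, exactly one of $H(X_i - X_j)$ and $H(X_j - X_i)$ equals one, so $\mathcal{K} = \mathcal{L} = n^{[2]}/2$ holds for every sample and not only in expectation; this is why $\mathcal{T}$ collapses to $4\mathcal{I} - n^{[2]}$.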

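Appendix D 
Proof of Theorem 3 

Proof: For ease of the following discussion, we will use $\phi(x, y)$ and $\psi(x, y)$ to denote the pdfs of the two bivariate normal components in (40), respectively. From (66), (67) and (80), it follows that the numerator $\mathcal{T}$ of $r_K$ can be simplified to 

$$\mathcal{T} = 4\sum_{i=1}^{n}\sum_{j=1}^{n} H(X_i - X_j)H(Y_i - Y_j) - n^{[2]} \tag{85}$$ 

which yields 

$$\mathbb{E}(\mathcal{T}) = 4n^{[2]}\underbrace{\mathbb{E}\left[H(X_1 - X_2)H(Y_1 - Y_2)\right]}_{E_1} - n^{[2]} \tag{86}$$ 

by the i.i.d. assumption. To evaluate $E_1$ in (86), we need the joint distribution of $(X_1, Y_1, X_2, Y_2)$, denoted by $\varphi(x_1, y_1, x_2, y_2)$, which is readily obtained as 

$$\varphi = \left[(1-\epsilon)\phi_1 + \epsilon\psi_1\right]\left[(1-\epsilon)\phi_2 + \epsilon\psi_2\right] = \underbrace{(1-\epsilon)^2}_{\alpha_1}\underbrace{\phi_1\phi_2}_{\varphi_1} + \underbrace{\epsilon(1-\epsilon)}_{\alpha_2}\underbrace{\phi_1\psi_2}_{\varphi_2} + \underbrace{\epsilon(1-\epsilon)}_{\alpha_3}\underbrace{\phi_2\psi_1}_{\varphi_3} + \underbrace{\epsilon^2}_{\alpha_4}\underbrace{\psi_1\psi_2}_{\varphi_4} \tag{87}$$ 

where $\varphi$, $\phi_i$ and $\psi_i$ are compact notations of $\varphi(x_1, y_1, x_2, y_2)$, $\phi(x_i, y_i)$ and $\psi(x_i, y_i)$, $i = 1, 2$, respectively. Write 

$$U \triangleq \frac{X_1 - X_2}{\sqrt{\mathbb{V}(X_1 - X_2)}} \quad \text{and} \quad V \triangleq \frac{Y_1 - Y_2}{\sqrt{\mathbb{V}(Y_1 - Y_2)}}.$$ 

Then, with respect to $\varphi_1$, $\varphi_2$, $\varphi_3$ and $\varphi_4$ in (87), $(U, V)$ follows four standard bivariate normal distributions with correlations 

$$\varrho_1 = \rho \tag{88}$$ 

$$\varrho_2 = \frac{\rho + \lambda_x\lambda_y\rho'}{\sqrt{1+\lambda_x^2}\sqrt{1+\lambda_y^2}} \to \rho' \quad \text{as } \lambda_x, \lambda_y \to \infty \tag{89}$$ 

$$\varrho_3 = \frac{\rho + \lambda_x\lambda_y\rho'}{\sqrt{1+\lambda_x^2}\sqrt{1+\lambda_y^2}} \to \rho' \quad \text{as } \lambda_x, \lambda_y \to \infty \tag{90}$$ 

$$\varrho_4 = \rho' \tag{91}$$ 

respectively. An application of Sheppard's theorem to (86) along with (88)-(91) yields 

$$\mathbb{E}(\mathcal{T}) = 4n^{[2]}\sum_{i=1}^{4}\alpha_i\left(\frac{1}{4} + \frac{1}{2\pi}\sin^{-1}\varrho_i\right) - n^{[2]} = \frac{2n^{[2]}}{\pi}\left[\alpha_1\sin^{-1}\rho + 2\alpha_2\sin^{-1}\varrho_2 + \alpha_4\sin^{-1}\rho'\right]. \tag{92}$$ 

Now it is not difficult to verify that the first statement (41) holds by 1) dividing both sides of (92) by $n^{[2]}$, 2) letting $\lambda_x \to \infty$ and $\lambda_y \to \infty$, and 3) ignoring the $O(\epsilon^2)$ terms. 

To prove the second statement (42), it suffices to evaluate $\mathbb{E}(\mathcal{S})$ by the relationship (68). Taking expectations of both sides in (69) along with the i.i.d. assumptions gives 

$$\mathbb{E}(\mathcal{S}) = n^{[2]}E_1 + n^{[3]}\underbrace{\mathbb{E}\left[H(X_1 - X_2)H(Y_1 - Y_3)\right]}_{E_2}. \tag{93}$$ 

Since $E_1$ is known from the above development, we only need to work out $E_2$ in (93). Let $\varpi(x_1, y_1, x_2, y_2, x_3, y_3)$, abbreviated as $\varpi$, denote the pdf of the joint distribution of $(X_1, Y_1, X_2, Y_2, X_3, Y_3)$. Then, from (40) and the i.i.d. assumption, 

$$\begin{aligned}\varpi &= \left[(1-\epsilon)\phi_1 + \epsilon\psi_1\right]\left[(1-\epsilon)\phi_2 + \epsilon\psi_2\right]\left[(1-\epsilon)\phi_3 + \epsilon\psi_3\right] \\ &= (1-\epsilon)^3\underbrace{\phi_1\phi_2\phi_3}_{\varpi_1} + \epsilon(1-\epsilon)^2\big(\underbrace{\phi_1\phi_2\psi_3}_{\varpi_2} + \underbrace{\phi_1\psi_2\phi_3}_{\varpi_3} + \underbrace{\psi_1\phi_2\phi_3}_{\varpi_4}\big) \\ &\quad + \epsilon^2(1-\epsilon)\big(\underbrace{\phi_1\psi_2\psi_3}_{\varpi_5} + \underbrace{\psi_1\phi_2\psi_3}_{\varpi_6} + \underbrace{\psi_1\psi_2\phi_3}_{\varpi_7}\big) + \epsilon^3\underbrace{\psi_1\psi_2\psi_3}_{\varpi_8}\end{aligned} \tag{94}$$ 

where $\phi_i$ and $\psi_i$ are compact notations of $\phi(x_i, y_i)$ and $\psi(x_i, y_i)$, $i = 1, 2, 3$, respectively. Define 

$$U \triangleq \frac{X_1 - X_2}{\sqrt{\mathbb{V}(X_1 - X_2)}} \quad \text{and} \quad V \triangleq \frac{Y_1 - Y_3}{\sqrt{\mathbb{V}(Y_1 - Y_3)}}.$$ 

Then, with respect to $\varpi_1$ to $\varpi_8$ in (94), $(U, V)$ follows 8 standard bivariate normal distributions with correlations 

$$\varrho_5 = \frac{\rho}{2} \tag{95}$$ 

$$\varrho_6 = \frac{1}{\sqrt{2}}\frac{\rho}{\sqrt{1+\lambda_y^2}} \to 0 \quad \text{as } \lambda_y \to \infty \tag{96}$$ 

$$\varrho_7 = \frac{1}{\sqrt{2}}\frac{\rho}{\sqrt{1+\lambda_x^2}} \to 0 \quad \text{as } \lambda_x \to \infty \tag{97}$$ 

$$\varrho_8 = \frac{\lambda_x\lambda_y\rho'}{\sqrt{1+\lambda_x^2}\sqrt{1+\lambda_y^2}} \to \rho' \quad \text{as } \lambda_x, \lambda_y \to \infty \tag{98}$$ 

$$\varrho_9 = \frac{\rho}{\sqrt{1+\lambda_x^2}\sqrt{1+\lambda_y^2}} \to 0 \quad \text{as } \lambda_x, \lambda_y \to \infty \tag{99}$$ 

$$\varrho_{10} = \frac{1}{\sqrt{2}}\frac{\lambda_x\rho'}{\sqrt{1+\lambda_x^2}} \to \frac{\rho'}{\sqrt{2}} \quad \text{as } \lambda_x \to \infty \tag{100}$$ 

$$\varrho_{11} = \frac{1}{\sqrt{2}}\frac{\lambda_y\rho'}{\sqrt{1+\lambda_y^2}} \to \frac{\rho'}{\sqrt{2}} \quad \text{as } \lambda_y \to \infty \tag{101}$$ 

$$\varrho_{12} = \frac{\rho'}{2}. \tag{102}$$ 

Using Sheppard's theorem again together with (94)-(102), we can obtain the expression of $E_2$ and hence $\mathbb{E}(\mathcal{S})$ in terms of $n$, $\epsilon$ and $\varrho_1$ to $\varrho_{12}$. Substituting $\mathbb{E}(\mathcal{S})$ into (68), letting $n, \lambda_x, \lambda_y \to \infty$ and ignoring the $O(\epsilon^2)$ terms, we arrive at (42), the second theorem statement. ■ 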
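
The finite-$\lambda$ expression for $\mathbb{E}(r_K)$ implied by (92) can be evaluated directly. The sketch below assumes $r_K = \mathcal{T}/n^{[2]}$, the mixture weights $\alpha_1 = (1-\epsilon)^2$, $\alpha_2 = \alpha_3 = \epsilon(1-\epsilon)$, $\alpha_4 = \epsilon^2$, and the correlation (89) in the form $(\rho + \lambda_x\lambda_y\rho')/\sqrt{(1+\lambda_x^2)(1+\lambda_y^2)}$. It stops before the large-$\lambda$, small-$\epsilon$ simplification that leads to (41). Function names are hypothetical.

```python
import math


def mixed_correlation(rho, rho_prime, lam_x, lam_y):
    """Correlation of (U, V) when one pair is clean and the other impulsive."""
    return ((rho + lam_x * lam_y * rho_prime)
            / math.sqrt((1.0 + lam_x ** 2) * (1.0 + lam_y ** 2)))


def expected_kendall(rho, eps, lam_x=100.0, lam_y=100.0, rho_prime=0.0):
    """E(r_K) = E(T) / n^[2] as read from (92), for finite lam_x and lam_y."""
    a1 = (1.0 - eps) ** 2
    a2 = eps * (1.0 - eps)  # equals alpha_3 as well
    a4 = eps ** 2
    rho2 = mixed_correlation(rho, rho_prime, lam_x, lam_y)
    return (2.0 / math.pi) * (a1 * math.asin(rho)
                              + 2.0 * a2 * math.asin(rho2)
                              + a4 * math.asin(rho_prime))
```

With `eps=0.0` the function returns the familiar $\frac{2}{\pi}\sin^{-1}\rho$, and for small positive `eps` with $\rho' = 0$ the value shrinks roughly by the factor $(1-\epsilon)^2$, which gives a sense of how mildly contamination affects $r_K$.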